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Two main areas of code development have been 


undertaken. The first is the implementation of CASSCF and 
SCF analytical first derivatives on the CRAY X-liP. Codes 

-for -performing geometry -optimization -using the -analytical 

gradients have also been installed and interfaced to the 
integral program and the wave function generation programs. 
As a result it is possible to obtain equilibrium geometries 
and saddle points with little input from the user beyond an 
initial guess, and with greatly increased efficiency 
relative to methods based on computing grids of energy 
values . 

The second major area of code development has 
been the installation of the complete set of electronic 
structure codes on the CRAY 2, an activity carried out in 
collaboration with Dr C. W. Bauschlicher jr. Particular 
effort was required to make the CASSCF and multireference 
Cl programs operational, as a result of numerous compiler 
bugs and the incomplete version of FORTRAN offered. The 
gradient program was much less troublesome. In addition, a 
new scheme for performing Hartree-Fock calculations with 
the integral list in memory has been implemented on the 
CRAY 2. This has been used for calculations with 300 basis 
functions locally and with basis sets of double this size 
in Minneapolis. Finally, a proposed method for extending 
the direct SCF approach to permit beyond-Har tree-Fock 
calculations has been written up for publication. 



In the area of application calculations the main 
effort has been devoted to performing full Cl calculations 
using the CRAY 2 and using these results to benchmark 
other methods. The main observation of this work has been 
the generally excellent agreement between multireference 
Cl results and the full Cl. The attached preprints 
describe some of the systems studied, other work is 
presently being written up for publication. In addition, 
calculations on the recombination of H with OH have been 
commenced, with particular emphasis at this stage on the 
choice of active space for CASSCF and mul t i ref erence Cl 


wave functions. 



/kp&t/bixi. 

N86 - 30374 

Special note to printer on MS “Integral Processing in beyond-Hartree-Fock calcu- 
lations, by Peter R. Taylor” , ; 
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Integral processing in beyond-Hartree-Fock calculations 


Peter R. Taylor 
ELORET Institute* 
Sunnyvale, CA 94087 


Abstract 

The increasing rate at which improvements in processing capacity outstrip 
improvements in input/output performance of large computers has led to recent at- 
tempts to bypass generation of a disk-based integral file. The “direct” SCF method 
of Almlof and co-workers represents a very successful implementation of this ap- 
proach. -The present work is concerned with the extension of this general approach 
to Cl and MCSCF calculations. After a discussion of the particular types of MO 
integrals for which — at least for most current generation machines — disk-based 
storage seems unavoidable, it is shown how all the necessary integrals can be ob- 
tained as matrix elements of Coulomb and exchange operators that can be calcu- 
lated using a direct approach. Computational implementations of such a scheme 
are discussed. 


* Mailing address: NASA Ames Research Center, Moffett Field, CA 94035 
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I. Introduction 
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One of the most interesting recent developments in computational quantum 
chemistry is the “direct SCF” approach of Almlof, Faegri and Korsell (AFK) [l]. 
Recognizing that in some circumstances it may not be feasible to generate a disk 
file of two-electron integrals (or supermatrix elements) to be used repeatedly in 
subsequent SCF iterations, AFK suggested that the two-electron integrals be re- 
calculated in each SCF iteration. That is, the Fock matrix contributions from 
each batch of integrals are computed and then each batch is discarded. There are 
two distinct sets of circumstances where this strategy should prove advantageous. 
Where computations are performed using an “in-house” minicomputer it will often 
be the case that the available disk storage is inadequate for large basis sets (150 
200 CGT.Os, say), or that the performance of the input /output (10) system is 
too low for the integral file to be processed in an acceptable real time. Alterna- 
tively, where computations are being done on a supercomputer (or even a large 
conventional mainframe computer), the available disk system capacity and/or per- 
formance may not be adequate for the size of basis set (300 or more CGTOs) for 
which the integral generation time would be acceptable. (Even the arrival of large 
primary memories, such as the 268 million words available on the CRAY 2, does 
not provide a complete solution to the problem of storing the integrals). The direct 
SCF method has proved its worth in both sets of circumstances: basis sets of over 
300 CGTOs having been handled on NORD 500 and VAX 11/780 minicomputers 
and over 500 CGTOs on an IBM 3033 [l]. Of course, AFK’s implementation of the 
direct SCF method incorporates a number of factors designed to improve overall 
performance. The full symmetry of the nuclear framework is used to minimize the 
number of distinct two-electron integrals which might have to be calculated, while 
density matrix pre-screening techniques are used to avoid calculation of integrals 
which contribute negligibly to the Fock matrix [1,2]. 

While it is very desirable to have a method of this type, there are, of course, 
many chemical problems for which correlation effects play an important role. Conse- 
quently, it seems appropriate to explore schemes whereby a similar general approach 
— that is, recalculating integrals when they are required — could be taken for Cl 
and MCSCF methods. This work presents an approach in which it is assumed that 
some integrals are required so frequently that it would be inefficient to recompute 
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them repeatedly, while other integrals can be recomputed as required. Clearly, there 
is an operational difference between this approach and the philosophy behind direct - 
SCF, as in the latter it is assumed that all integrals are in the same class as far as 
frequency of use is concerned.* 

II. MO Integrals in Direct MR-CI(SD). 

The various types of MO integrals appearing in single and double excitation 
direct Cl calculations have been discussed in detail by Siegbahn [3], Ahlrichs [4] and 
Saunders and van Lenthe [5]. These treatments cover not only the cases of one or 
several reference configurations, but also the case where the reference configuration 
is “internally contracted” [4,6-8]. It is clear from these treatments that it is desirable 
to have the integrals \ij\kl j, [ij\ka], \ij\ab\ and [ia\jb\ (where denote MOs 

occupied in at least one reference configuration, and a, b... the remaining MOs; 
charge density notation has been used for the integrals) available in the MO basis: 
these integrals contribute to many different terms in different ways. If Coulomb 
operator matrices J lJ . and exchange operator matrices K tJ± are defined via the 
matrix elements 

(p\J tJ \q) = [ij\pq\ (1) 

{p\K' J+ \q) = MN + Nip] (2) 

[p\K X] ~\q) = \ip\jq\ - Nip] (3) 

for p.q... arbitrary MOs, the above required integrals are all included in J 1 - 7 , K t; + 
and K u ~, i > j. 

* A note on terminology may be appropriate here. The expression “direct Cl” has an 
accepted and widely understood meaning: it refers to a Cl calculation in which the 
Hamiltonian matrix is never computed explicitly and stored. It is not unreasonable 
to use the term “direct SCF” for an SCF calculations in which the AO integral list 
is not stored. The expression “direct MCSCF” is closer in meaning to direct Cl: 
the Hessian is not computed explicitly. It is difficult to combine these meanings to 
cover the sort of method proposed in this work, and while this author has previously 
used the term “direct direct Cl” this is both ugly and confusing. No convenient 
alternative readily presents itself, however. 


4 





The remaining possible integral types, \ia\bc) and \ab\cd), are not required in the 
MO basis for direct Cl calculations [4,5]. Their contribution to the residual vector 
<7 = He can be written in terms of AO integrals for a suitable renormalization of c 
[4,5,8]. This approach is discussed further in section V below. 

It appears, therefore, that if a direct Cl scheme is used for optimization of an 
MR-CI(SD) wave function, the integrals which must be computed (and stored) in 
the MO basis are just the and K t;± for correlated. MOs i > j. It should be noted 
that while the term “Cl” has been used here, all of the above remarks apply also to 
methods based on the coupled-pair many-electron theory of Cizek [9]. This includes 
both coupled-cluster methods and approximate CEPA-type schemes [10-13]. 

III. MO integrals in MCSCF calculations. 

The question of which integral types need to be transformed into the MO basis 
has been investigated in some detail by Almlof and Taylor [2] . Their conclusion is 
that it is generally necessary to have imatrix elements of J tu and K tU:t in the MO 
basis: here t and u denote partially occupied ( active ) MOs in the MCSCF wave 
function. The availability of operators J tJ , J lt and K lt± (here denote 

doubly occupied ( inactive ) MOs) in the MO basis allows a very simple formulation of 
the MCSCF orbital optimization problem (see e.g. refs 14 and 15), but in a “direct” 
MCSCF formulation [14] it is always possible to rewrite the contributions of these 
operators in terms of the AO integrals [2,16] . Elements of J tu and K fu± are needed 
for the Cl step (or, in a full second-order treatment, the Cl sub-block of the Hessian 
and the Cl gradient term) and for some Cl-orbital rotation coupling terms, and most 
of these contributions are awkward to reformulate in terms of AO integrals. In this 
way, each cycle of the MCSCF optimization requires construction of 3 tu and K tu± 
once, followed by contraction of a supermatrix with quantities similar to density 
matrices. This contraction must be performed in every micro-iteration through the 
MCSCF linear equation system if a full second-order optimization is performed — 
for first-order schemes [15,17,18] intermediate Fock-type operator matrices can be 
constructed with one such contraction step and then re-used within the given cycle. 
Full details are given in ref 18. 

For a second-order MCSCF scheme with the minimum number of integrals 
stored on disk, therefore, it will be necessary to recompute the AO integrals in every 
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micro-iteration of every cycle. In individual cases it may be preferable to construct 
J lj , K lJ± , J lt and K if± once in each cycle, and then to process all the integrals in 
the MO basis: this would depend on the balance between the transformation labour 
to obtain these operators (and how many there are) and the integral evaluation time. 
For large extended systems it may be that sparseness in the integral list combined 
with pre-screening of density matrices might make the completely direct MCSCF 
approach favourable. Dynamic adjustment of the number of micro-iterations used 
in a given cycle (solving the linear equations less accurately when far from overall 
convergence) will also improve performance. In any event, for the purposes of the 
present discussion it is clear that the problem of generating MO integrals for use in 
an MCSCF calculation is equivalent to that of a CFcalculation: J and operators 
over certain occupied MOs must be available. 

IV. Construction of operator matrices. 

Where the AO integral list is available, and disk capacity or performance is 
adequate, the most efficient route to the required J and K* matrices is via a limited 
four-index transformation [4,5] (see also ref 19 and refs therein), performed as the 
four quarter-transformations 


[iu\\a\ = y^[ju^|Acr]C M t 

(4a) 

[ij\Xo\ = U j 

V 

(46) 

\ij\po] = ^[u'|A<t]Ca p 

A 

(4c) 

1 tj\pg] = ^2[ij\P°]C<rq 

M 


a 


for the element Here fi, v, A and a denote AOs and C is the matrix of MO 
coefficients. The most time-consuming of the four steps is (4a), which behaves as 
nN 4 operations for n active or correlated MOs and N AOs; (46 — d) behave as 
n 2 iV 3 . Similar behaviour is obtained for calculation of K l J± provided that the AO 
integrals are sorted differently before the transformation. 

A less efficient (in terms of floating-point operations) procedure essentially in- 
volves combining the first two quarter-transformations into a single step, generating, 
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say. 

W>) = £ (5) 

A o 

and then transforming /x and u to the MO basis. Defining “density matrices” D tJ 
via 

D% = C^j ( 6 ) 

allows (5) to be viewed as contraction of integrals with a density matrix, analogous 
to Fock matrix construction in an SCF calculation. (5) behaves as n 2 N 4 , that is, 
some n times worse than (4). However, a scalar implementation of (5) requires no 
sorting of the AO integrals, and there is no need to expand the integral list beyond 
the normal canonical indexing p > i', A > a and (/ xu ) > (Act). 

Consider now an approach in which AO integrals are computed, used in some 
transformation process and then discarded, without being written to disk and re- 
read. If the n 2 N 4 process defined by (5) is used, it will be possible to hold simul- 
taneously 2L/N(N + 1) operator arrays J or K ± in L words of memory. As there 
are some | n 2 operators in toto to be constructed, it will be necessary to generate 
the integrals 3n 2 N 2 j4L times. For 200 AOs, 20 correlated or active MOs and 4 
million words of memory some 3 passes would be required, however, a 50% increase 
in n or N results in a factor of 2 increase in the number of passes, as would a 50% 
reduction in the memory available. The n 2 and N 2 scaling in the number of passes 
is clearly a considerable disadvantage of the n 2 N 4 approach. 

On the other hand, by defining a “test density” as 

DxV 1 = maxIC^t'CVyl, (7) 

(u! 

where the notation [ij] denotes all MO pairs whose operators are being processed in 
the current pass, an effective pre-screening technique can be implemented to decide 
whether a particular \fxv\\a] need be calculated. (This process is readily extended 
to the case of calculating AO integrals in shells, as discussed below and in refs 1 
and 2). Clearly, as n or N increases, the number of operators generated in each 
pass decreases. It may be expected that, in turn, the sparsity of will increase 
(certainly D tes< cannot become less sparse) which will decrease the number of AO 
integrals to be calculated in each pass. This phenomenon will tend to offset the 
effect of the n 2 and N 2 scaling discussed above, and will play an important role 
when localized MOs are used. 
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Completion of the transformation of the J u , etc, is also simple in the case of 
the n 2 N 4 approach. Each operator matrix, once constructed in the AO basis, can 
be transformed to the MO basis and then written to disk directly. No additional 
sorting is required and the final operator matrices are in exactly the form required 
for “matrix-formulated” direct Cl [4,5,20]. Typical loop structures for constructing 
various operators are discussed in section VI below. 

In an implementation of the nN 4 scheme different procedures must be followed 
for the J and K cases. For J operators, it is necessary to compute blocks of integrals 
[/xi/|Acr], for all n > u and for as many Aa(A > a) pairs as will fit in L words of 
memory. It is then possible to carry out the first two quarter-transformations (4a, b) 
for all ij (i > j) pairs. The resultant [fjjAa] must then be written to disk, so that 
once all of the [ij | Act] are available they can be re-sorted to AO J matrices for the 
final half-transformation. Note that in the AO integral generation it is not possible 
to restrict consideration to the case [nv) > (Aa) (the normal canonical ordering): 
effectively, the integrals must be computed twice. For K ± integral blocks \nv\\o], 
with all //A and for as many i/o[u > c) as can be held in memory, are transformed 
to [u/|y<7] ± [fcjjV] for all i > j. Again, these half-transformed integrals must be 
re-sorted for the final transformation. Clearly, this latter ordering of \ixv\\o] is 
different from the J case and the n 2 N 4 scheme. Indeed, it not only differs from the 
conventional ordering used in integral programs, but it also involves some redundant 
recomputation of integrals because of the need to have all fi\ pairs, not just /x > A. 
Essentially, the AO integrals must be computed four times. There are thus not only 
disk and 10 overheads associated with the nN 4 scheme, but also additional CPU 
costs occasioned by recomputation of integrals. It will depend on the individual case 
whether these additional overheads offset the much more favourable floating-point 
behaviour of the transformation step relative to the n 2 N 4 scheme. It should be 
noted that the disk space (and 10 required) behave as n 2 N 2 , which is usually very 
much less than the N 4 requirements for the initial sorting of a disk-based integral list 
for a conventional transformation. A disadvantage of the suggested implementation 
of the nN 4 procedure is that it is not possible to make as much use of pre-screening 
as in the n 2 N 4 case. This is because the first half-transformation is used to produce 
[ij j Act] for all ij from [/xz/|Aa] for all /xx/: the effective “test density” analogous to (6) 
would involve all ij pairs and would thus be as dense as the worst possible case for 
the n 2 N 4 scheme. It is quite conceivable that in some cases, such as large extended 
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organic systems, the n 2 N 4 approach with its effective pre-screening would be the 
method of choice, while for relatively compact systems of heavier atoms, such as 
polynuclear transition metal complexes, the nN 4 approach would be preferable. 

V. External Exchange Operators.- 

As was noted above, it has been pointed out by several authors [4,5,8] that the 
direct Cl contribution of the MO integrals [a6|cf] and [a6|cd] can be evaluated in 
the AO basis using the operator matrices K p with elements 

KS, = X>|jr'»c,..c„6, (8) 

where 

{n\K p \u) = ^[MA|i/a]V£ , ' (9) 

A a 

with 

~ V C P p C \ p C a q ( 10 ) 

P <1 

The “Cl coefficient” array c is obtained as follows. For doubly-excited CSFs which 
differ only in virtual MO occupation (i.e. all have the same virtual MO spin- 
coupling and the same (N e - 2)-electron occupied MO part P (for N e electrons 
correlated)) the various Cl coefficients Cp‘ are collected into the array c p which is 
then renormalized to give cp according to refs [4,5] . We then have 

C l p = C r p -r ' S ^bip&dqC < QB i( 2 d (ll) 

Q 

where B^ d is a two-electron coupling coefficient and Cq is the Cl coefficient of a 
singly-excited CSF. 

Clearly, the construction of K p in the AO basis using (8 - 11) parallels the 
construction of the K tJ operators via the n 2 N 4 scheme outlined above in section 
IV. Indeed, by explicitly recognizing that the two virtual MOs can be either singlet 
or triplet coupled it is possible to proceed via K p± operators obtained from sums 
and differences of integrals as in eqns (2) and (3). Pre-screening via a test density 
matrix can be used to reduce the number of AO integrals which must be calculated, 
offsetting in part the n 2 N 4 dependence of the K p generation. However, the exter- 
nal exchange operator construction must be performed in each Cl iteration, which 
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(when the time taken to re-evaluate the integrals is included) is likely to lead to its 
dominating the timing for calculations with large basis sets. 

It is also possible to consider an alternative scheme for computing the contri- 
bution from the external exchange operators which shares features with the nTV 4 
scheme for J ,J and K t;± . It is possible to form arrays K cd according to 

*# = £&* IHCacC^ (12) 

A cr 

and then, without any intermediate 10, to combine these half-transformed integrals 
with Cl-coefficients as 

= £ £ K % c r- < 13 > 

c d 

The Kp U would be written out to disk for re.-sorting. The strategy would be to hold 
all A <7 values in memory (in (12)) for as many fxu values as possible. The floating- 
point behaviour of (12) (assuming that in practice it would be performed as two 
(Successive quarter-transformations) is (TV — n)N 4 , while that of (13) is essentially 
n 2 (TV — n) 2 N 2 . Of course, the same recomputation of integrals is required for (12) 
as for the nN 4 approach to construction of K t;± matrices discussed in the previous 
section. 

For the case of the “externally contracted” Cl method of Siegbahn [7], integrals 
such as \ac\bd} are used not simply to form K[ b but rather to form A p where 

Ar-££*54‘ (H) 

a b 

Here c a p is a Cl coefficient in a wave function obtained in the lowest order of 
perturbation theory. A p need be constructed only once during the contracted Cl 
calculation, and thus there is a very considerable advantage over the normal Cl 
methods, since these require recalculation of the external exchange contribution in 
each iteration. 

VI. Treatment of symmetry 

The direct SCF implementation of AFK benefits enormously from the exploita- 
tion of symmetry. This is used to reduce the number of distinct integrals which must 
be computed, and to reduce the dimensions of the various matrices which must 
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be processed. It is well known that the incorporation of symmetry considerably 
improves the efficiency of conventional 4-index transformation and Cl programs, 
and it is certainly desirable to extend these improvements to the present approach 
to beyond-Hartree-Fock methods. This is not difficult, although there are several 
points worthy of note. 

First, the operators J l] and K l]± will not always transform according to the 
totally symmetric irreducible representation of the molecular point group, G. Thus 

RJ'iR' = Y,D%i(Ryj‘’\ (15) 

A 

where R € G, and D^^R)' is an element of a representation matrix for a, which 
may not be an irreducible representation. By choosing appropriate combinations of 
ij and their partner MOs in degenerate irreducible representations, it is possible to 
restrict attention to the case of a irreducible. In (15), therefore, J lJ would represent 
a combination of J operators which transform according to row K of irreducible 
representation a. In an SCF calculation, the Fock and density matrices transform 
according to the case of a being the totally symmetric irreducible representation, 
and for this case a straightforward scheme for using a list of symmetry-distinct AO 
integrals to construct “skeleton” matrices which are later symmetrized to give the 
full result has been derived by Dupuis and King [21], based on earlier work by Dacre 
[22] and Elder [23]. The present author has extended the Dupuis and King scheme 
to the case of non-totally symmetric operators [24]. The only difficulty that arises 
in this extension is the need for full representation matrices (not merely characters) 
in the symmetrization of the skeleton matrices. These can be calculated from the 
characters of the group and a chain of subgroups by an ingenious method due to 
Hurley [25]. 

It is thus possible, to use the technique of ref 24 to generate integrals over MOs 
from a list of symmetry-distinct AO integrals. Use of the n 2 N 4 scheme (5) for 
the transformation step leads to very similar processing as in the SCF case, as for 
(5) there is no need to order the integrals. Fig 1 shows the loop structure of an 
integral routine designed to implement this scheme. The loop structure is greatly 
simplified: most codes would feature double loops over centres and then shells on 
those centres. Loops over shell components have not been shown explicitly. In the 
figures, the stabilizer [24] of a shell or centre is that subgroup of G under which 
the centre is invariant. Distinct integrals are generated in terms of double coset 
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representatives for various pairs of stabilizers: for full details the reader is referred 
to Davidson [26]. As far as the overall loop structure of Fig 1 is concerned there is 
essentially no change from the SCF case, for which the statements in the innermost 
loop would simply add or subtract appropriate Fock matrix contributions. 

It is also possible to handle symmetry in J operator construction by the nN 4 
scheme (4) straightforwardly and a possible loop structure is given in Fig 2. How- 
ever, complications ensue for the K ± operators. This is because integral evaluation 
schemes are based on charge densities (products of basis functions) and determining 
the symmetry-distinct AO integral list is also based on charge densities. Such an 
approach naturally works for J operators, since what is required is a list oi [nu\\o\ 
with ixw fixed and all A a, and this is simply all charge densities A a for the single 
charge density yiu. Symmetry-distinct integrals are obtained from [nRi'\T{\Sa)\, 
where R.S and T are operators from the point group: the range of operators giv- 
ing distinct integrals is determined by the symmetry transformation properties of 
the points on which the AOs are centred. Again, it is simple to work in terms of 
unique charge distributions (j,Rv and A So and their transforms, and to form all 
T(\So) for a fixed fxRi /. For K ± operators, however, what is needed from the list 
\fj,Ru\T(\Scr)] are terms with nT A fixed and all possible RuTSo. Not only is this 
clearly not charge distribution based, but the range of T operators giving distinct 
integrals cannot be determined until //, u, A, <7, R and S are known. This compli- 
cates the loop structure of the integral program, and, since it is usually desirable to 
compute information about charge distributions in the outermost possible loop, it 
will be necessary either to compute this information in inner loops or to compute 
information about all possible charge distributions in the outer loops, performing 
redundant work since some of these distributions will turn out to be non-unique. 
A nN 4 scheme loop structure for K~ operators, incorporating symmetry, is given 
in Fig 3, and the problems associated with K ± operators can be clearly seen by 
comparing Fig 3 with Fig 2. 


VII. Computational considerations 

The need for repeated calculation of AO integrals, particularly in implementa- 
tions of the n?N 4 transformation procedure (5), suggests that a primary goal must 
be an efficient integral evaluations scheme. This problem has received considerable 
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attention in the last fifteen years [27-29], and a number of very efficient schemes 
have been devised. A key feature of these schemes is the use of shells of basis func- 
tions, a shell being defined by a set of contracted Gaussian functions of the same L 
value, located on the same centre, with the same exponents and contraction coeffi- 
cients but differing in their angular behaviour. Integrals over four such shells — a 
“shell block” of integrals — share many common factors, and avoiding redundant 
recomputation of these factors results in a substantial increase in efficiency. Such 
use of shells rather than individual basis functions is implicit in the loop structures 
of Figs 1-3. The use of shells requires a modification of the pre-screening proce- 
dure: clearly, as long as one integral in a shell block is required it will be necessary 
to compute the entire block. It is therefore convenient to define test densities (7) 
for shells rather than basis functions. Thus 

D'mh = max \CfiiCi/j\, n e M.u € N (16) 

(tj'l 

for shells M and N. 

Most AO integral evaluation schemes are rather readily vectorized [30]. Integral 
evaluation is also a task which is suited to parallel architectures [30]. For the rest 
of this section, therefore, we shall assume that the problem of efficient integral 
evaluation has been solved and concentrate on the processing of the AO integrals 
once they are available. 

The nN 4 transformation (4) is vectorizable in terms of successive matrix mul- 
tiplications in which the innermost loop is of order N . For vector processors such as 
the CRAY machines, multiplication of matrices of this order leads to performance 
close to the theoretical maximum. For computers that require greater vector lengths 
to achieve maximum performance it is possible to write (4) as a set of “vector = 
vector + scalar*vector” (SAXPY [31]) operations of length n 2 to N 2 or even n 2 N 
to TV 3 [2]. It is also possible to perform the first half-transformation (4a, 6) effi- 
ciently on a parallel architecture, by generating and processing subsets of integrals 
(such as [^|Aa], V A > a and fixed / 1 > u) on each processor. However, the re- 
ordering and subsequent processing of the half-transformed integrals will require 
considerable data movement between processors; and the overall efficiency will de- 
pend critically on the speed of inter-processor communication [32]. For machines 
with a large common memory or solid-state disk this will obviously be much less of 
a problem than for polytope architectures, such as hypercubes, with relatively slow 
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data paths between nodes. 

The n 2 N 4 scheme (5) is straightforward to vectorize (in terms of SAXPYs) 
on the number of operator matrices which can be held in memory simultaneously. 
The maximum possible value is |n 2 , when all J, K + and K~ operators can be 
processed in one pass. For large basis sets the memory -requirements would usually 
be prohibitive, and a subrange of operators would be processed in each pass. This 
may lead to vector lengths too short for efficient processing. This scheme is very 
easy to adapt to parallel architectures: each processor simply generates a subset of 
the J lJ , etc. although this requires each processor to generate all the AO integrals 
if inter-processor communication is to be avoided. Of course, for multi-processor 
architectures with common memory, such as the CRAY X-MP or CRAY 2 the 
latter problem does not arise. 

It is clear that similar reasoning can be applied to the external exchange con- 
tribution discussed in section V. Indeed, some additional steps which arise in this 
case, such as (10) and (13), are also readily vectorized. It therefore seems that 
processing of integrals along the lines described here can be made very efficient on 
most current generation computing machinery. 

Finally, it may be useful to give an example of the data storage and recalcula- 
tion requirements in a large Cl calculation using the schemes suggested here. We 
consider a calculation on the molecule Fe(CO)s, similar to the largest calculations 
reported by Liithi and co-workers [33], but using a larger basis. Assuming that an 
[8s6p4dl/] basis is used for Fe and a [4s2pld] basis for C and O, there will be 233 
AOs (using spherical harmonics) and 39 occupied MOs at the Hartree-Fock level. 
If only the Fe 3 d and 4s and ligand o lone pair electrons are correlated there will 
be 9 MOs correlated, if the ligand jt electrons are included there will be 19. We 
assume that 4 million words of central memory are available. For 9 MOs correlated 
there will be 126 J X] and K t3± operators, and using the rrN 4 scheme all could be 
computed in a single pass over the integrals, using (5). If density matrices (6) are 
formed in advance, the storage for operator matrices is halved and two passes over 
the integrals would be required. The final operator matrices would require less than 
one million words of disk space, assuming that C * 2 v symmetry is used. Use of the 
full Dzh symmetry would reduce this even further. If the nN 4 scheme (4) is used, 
one pass each for J and K operators would be required: this would be equivalent 
to recomputing the integrals about six times. Re-sorting of the half- transformed 
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operators could be done in memory. For 19 MOs correlated the number of passes 
for the nN 4 scheme would not change, however, the n 2 N 4 scheme would require 
about six passes over the integrals using (5), or nine using (5) and (6). In either 
case some 4 million words of disk space would be needed for the final operators. For 
the n 2 N 4 case these calculations would all vectorize with, a vector length greater 
than 60, which would be very efficient on machines such as the CRAY 1 or CRAY 
X-MP. 

In each iteration of the direct Cl it is most efficient to generate the contribution 
from the external exchange operators first. For 9 MOs correlated there are 81 
external exchange operators to be computed, these could be generated in two passes 
using (9). For 19 MOs there are 361 operators, these would require five passes. Using 
(12) one pass only would be required for either 9 or 19 MOs correlated, but again 
this is equivalent to computing the integrals four times. The completed exchange 
operators can be used as the first contributions to the vector a, which would be 
of length about 350 000 words for 9 MOs correlated, assuming C 2 V symmetry, or 
3 000 000 for 19 MOsfcorrelated. In the latter case it would be necessary to process 
the Cl coefficients from disk if all of a is to be held in memory. Calculations on this 
scale would hardly be possible using a “conventional” disk-based transformation 
and direct Cl approach. 

It is clear that the overall labour in such a calculation, while substantial, is 
not unreasonably large for a modern supercomputer, or even a large mainframe. It 
is also clear that if the only consideration is to minimize the number of times the 
AO integrals are recomputed there is little to choose between the nN 4 and n 2 N 4 
transformation schemes, at least for calculations of this size. 

VII. Conclusions 

The present work is an attempt to outline some novel prospects for large basis 
set electronic structure calculations that include electron correlation. In general, 
the various approaches suggested are well suited to modern computer architectures 
and share the overall philosophy of avoiding or minimizing the disk-based storage 
and retrieval of integrals. Only certain MO integrals need be stored: no storage 
of AO integrals is required and the method is thus a natural generalization of the 
direct SCF method of Almlof and co-workers. 
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Fig 1. Loop structure for n 2 N 4 operator matrix generation 

Loop on subranges of ij such that all matrices fit in memory 
Loop on shells M, stabilizer is .M 

Loop on shells N (< M), stabilizer is M 

Define R as generators for double cosets MGR V G £ Q 
Loop on elements R of R generating shells RN 
Define U as stabilizer of M.RN 
Loop on shells A (< M), stabilizer is £ 

Loop on shells E (< A, unless A = M, when E < N), stabilizer is S 
Define S as generators for CGS V G E § 

Loop on elements S of S generating shells S E 
Define V as stabilizer of A.S'E 
Define T as generators for UGV V G 6 $ 

Loop on elements T of T generating T(ASE) 

Compute \iu,Ru\T(\So)} V n 6 M, etc 
Accumulate contributions into J l j ts<? 

K 1 ^ / TSo . or whichever skeleton operator 
matrices are being generated in this pass 
End loop on T 
End loop on 5 
End loop on E 
End loop on A 
End loop on R 
End loop on N 
End loop on M 

Symmetrize operator matrices, complete transformation 
and write operators from this subrange to disk 
End loop on subranges of ij 
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Fig 2. Loop structure for nN 4 J operator matrix generation 

Loop on shells M , stabilizer is M 

Loop on shells N (< M), stabilizer is M 

Define R as generators for double cosets MGR V G G £ 

Loop on elements R of R, generating shells RN 
Define U as stabilizer of M.RN 
Loop on shells A, stabilizer is C 
Loop on shells T (< A), stabilizer is 5 
Define S as generators for CGS V G G $ 

Loop on elements 5 of S, generating shells ST, 

Define "V as stabilizer of A .ST 

Define T as generators for UGV V G 6 $ 

Loop on elements T of T, generating T{ A ST) 

Compute [fu.Ru\T(XScr)\ V n 6 M, etc 
stored in memory, indexed by n, v. A, A, T.o, S and T 

I 

End loop on T 
End loop on 5 
End loop on T 
End loop on A 

Form skeleton for each ij, i >.j and /j, G M,u G N 

symmetrize and write to disk 
End loop on R 
End loop on N 
End loop on M 

(Read back, re-sort and transform — loop structure not given) 
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Fig 3. Loop structure for nN 4 K operator matrix generation 


Loop on shells M, stabilizer is M 

Loop on shells N (< M), stabilizer is N 

Define H such that H M V H £ H, are distinct left cosets of M 
Loop on elements H of H, generating shells HN 
Loop on shells A, stabilizer is £ 

Define R as generators for double cosets MG£ V G £ § 

Loop on shells E, stabilizer is 5 

Define S as generators for MGS V G £ £ 

Loop on elements R of R, generating shells RA 
Define U as stabilizer of M.RA 
Loop on elements S of S generating shells SE 
Define V as stabilizer of N.SH 
Define T as generators for UGV V G €. § 

If H e T then 

Compute \fiRA\H[uSc)) V p £ M, etc 
stored in memory, indexed by ix,u,A,A,T.,cr,R,S and H 
Endif 

End loop on S 
End loop on R 
End loop on E 
End loop on A 

Form skeleton A’^^for each ij, i > j and p t M, v € N 
symmetrize and write to disk 
End loop on H 
End loop on N 
End loop on M 

(Read back, re-sort and transform — loop structure not given) 
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Abstract 


Full configuration-interaction calculations are reported, and compared to other 
methods, for H 2 O at its equilibrium geometry and at two geometries with the H- 
0 bonds stretched. Since the percentage of the SCF reference in the FCI wave 
function decreases greatly with the bond elongation, the accuracy of techniques 
based on a single reference do not compare well with the FCI results. However, 
the results from a CASSCF /MRCI treatment are in good agreement with the FCI. 
Correlation effects in F compared to Ne are far more similar than for F“ compared 
to Ne, despite F“ and Ne being isoelectronic. Since the importance of higher than 
double excitations is more important for F - than F, a very high percentage of 
the correlation must be obtained to accurately compute the electron affinity. In a 
CASSCF /MRCI treatment the higher than quadruple excitations contribute 0.02 
eV to the EA, even for modest basis sets. 
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I. Introduction 

We have recently compared several different computational procedures to full 
configuration-interaction (FCI) calculations for Ne atom [l], O atom and its neg- 
ative ion O' (2], HF and NH 2 [3]. Unlike previous FCI calculations [4-5] and 
subsequent tests of methods [6-8], double zeta plus polarization (DZP) or larger 
basis sets were used. Our recent calculations were made possible by recent the- 
oretical [9-10] and technological advances [ll]. The benchmark calculations [1-3] 
showed that the quality of such approximations as the Davidson correction [12] or 
the coupled-pair functional (CPF) method [13] varied with the basis set and with 
the weight of the SCF reference in the Cl expansion. While the dependence of the 
accuracy of such approximations on the weight of the SCF reference is not unex- 
pected, the dependence upon the basis set was a surprise., For example, in Ne atom 
the Davidson correction underestimates the importance of quadruples for basis sets 
without polarization functions, but overestimates their importance by 20% for a 
basis set with two sets of polarization functions. The CPF approach shows the 
opposite effect, improving as the basis set is expanded. ! 

Normally, it is not the absolute accuracy of the methods, but the relative 
accuracy across a potential surface, that is more important. The HF and NH 2 cal- 
culations show that for large geometrical distortions, the SCF reference becomes 
sufficiently poor that the above approximations are in general unreliable. Multi- 
reference singles and doubles Cl (MRSDCI) calculations based upon a complete 
active space SCF (CASSCF) [14] wave function give potentials that far better par- 
allel the FCI results. The inclusion of the multi-reference analog of the David- 
son correction is found to either improve or leave unchanged the accuracy of the 
CASSCF/MRCI treatment. While the calculations on HF and NH 2 lead to consid- 
erable optimism as to the accuracy of the CASSCF /MRCI approach, calculations 
on the electron affinity (EA) of O atom show that even this method has its limi- 
tations; a very large CASSCF /MRCI treatment (308 reference configuration state 
functions (CSF's) in D 2 h) is not able to account for all of the differential higher 
excitation contributions to the EA. 

The previous results have led us to consider H 2 O, F and F“. H 2 O has one 
more electron than NH 2 , thus we are able to see if the accuracy of the different 
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approximations depends upon the number of electrons. The decomposition of the 
correlation by excitation level shows F to be more similar to Ne, while F~ is different 
from either F or Ne. It is this different character of the correlation which leads to 
the problems associated with computing the EA. 

II. Method of calculation 

The O basis set is the Dunning [15] [4s2pj contraction of the Huzinaga [16] 
(9s5p) primitive set augmented with a d polarization function with an exponent 
of 1.2. The H basis set is the [2s] contraction of the (4s) primitive set scaled by 
1.2 [15: and augmented with a set of p (a=0.8) polarization functions. For the F 
and F~ calculations the (9s5p) primitive set is contracted to either [4s2p] following 
Dunning [15], or to [5s3p] by freeing the outermost primitive in the contraction. 
To adequately describe F~, the diffuse p set optimized by Dunning and Hay '[17] is 
added, yielding a final valence basis sets of the form (9s6p)/[4s3p] and (9s6p)/[5s4p]. 
Since the bases sets are given to a different number of significant figures in references 
15 and 17, to avoid confusion we tabulate the basis sets in Table I. A 3d polarization 
function is optimized at the FCI level for F~. The optimal value was found to be 
1.60, the same as that found by Ahlrichs et al. [18] in the optimization for HF 
at the independent electron pair approximation (1EPA). Therefore, when two d 
functions are added, the exponents are taken from Ahlrichs [18], a=4.5 and 1.3. In 
all calculations the 3s components of the 3d orbitals are deleted. 

For H 2 O we consider the equilibrium geometry (r e ), as defined in Table II, and 
two configurations where the H-O-H angle is unchanged and the O-H bonds are 
stretched to 1.5*r e and 2*r e . At these three geometries we consider several different 
levels of treatment. Many correlation treatments are based on a single reference, 
and for these the SCF orbitals are used. In order to reduce the dimension of the FCI 
expansion, the Is electrons are not correlated in any of the calculations. The first, 
level of correlation includes single and double excitations from the SCF reference 
(SDCI). We use both the Davidson correction [12, (denoted +Q) and the CPF [13] 
(an essentially size-consistent reformulation of SDCI) to estimate the importance of 
higher excitations. The importance of higher excitations is also treated via a multi- 
reference SDCI calculation. These MRCI calculations are based upon a CASSCF 
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optimization of the orbitals and include all of the CSF’s in the CASSCF as refer- 
ences. Two different CASSCF calculations are performed. The first correlates the 
two O-H bonds; the four bonding electrons are distributed within the two active 
ai and two active b 2 orbitals. Although this CASSCF gives proper dissociation, 
the MRCI calculation shows important configurations (coefficient greater than 0.05 
in the MRCI wave function) not included in the reference space. These additional 
CSF’s involve excitations out of the bj lone pair of oxygen. To account for this 
additional important correlation effect, the two bi electrons and two bi orbitals 
are added to the CASSCF active space. The MRCI(BIG) calculation based upon 
the CASSCF(BIG) orbitals does not show any additional important CSF’s. The 
importance of the additional CSF’s associated with the bi lone pair decreases as 
the bond length is increased. As the H atoms donate charge to the oxygen, this 
additional correlation reflects some 0” character near r e which vanishes as H 2 O 
dissociates. The MRCI(BIG) calculations contain only 31096 CSF’s, as compared 
to the FCI calculations which contain 6 740 280 CSF’s, expanded into 28 233 466 
determinants and 113 million intermediate states [9,10] in the Knowles and Handy 
FCI procedure. 

The calculations for F and F~ proceed along the same lines as for H 2 O. The Is 
electrons are not correlated in any of the calculations. The CASSCF wave functions 
have the 2p electrons and the 2p and 2p' orbitals as active. In addition to the 
calculations performed for H 2 O, two additional single-reference procedures are used. 
The first includes all single, double and triple excitations (SDT), while the second 
includes in addition the quadruple excitations (SDTQ). Fot the largest basis set, 
the SDTQ calculation leads to a Cl expansion of 110679 CSF’s, which is at about 
the limit of our conventional Cl program. This is far larger than the 19996 CSF’s 
in the MRCI expansion, but far smaller than the 6 574 356 CSF’s (27 944 852 
determinants and 224 million intermediate states) in the FCI wave function. 

III. Results and discussion 

The total energies of the H 2 O calculations are summarized in Table II, while 
the correlation contributions are decomposed in Table III. The correlation energy, 
relative to SCF, varies rapidly with R(O-H), increasing by a factor of 1.71 between r e 
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and 2*r e . The single and doubles correlation energy shows a much smaller change, 
increasing by a factor of only 1.45. Thus the error in the SDCI calculation is 
quite large. (The difference between the FCI potential and those at other levels is 
illustrated by shifting the potential curves to bring them into agreement with the 
FCI potential at r £ , see Table IV). The smaller increase in the correlation energy 
with r for the SDCI relative to the FCI shows the differential importance of the 
higher than double excitations with increasing r. The Davidson correction applied 
to the SDCI and the CPF method both show the correct trend of increasing rapidly 
with distance. However, the Davidson correction is too small everywhere, with the 
error increasing w-ith increasing r. The CPF estimate is also too small at r e , but 
becomes too large at 2*r e . Thus, at each point, the CPF has about the same error 
as the Davidson correction, but since the error changes sign, the error in the CPF 
between 1.5*r e and 2*r e is larger. 

The CASSCF treatments, when compared to the SCF, show an even larger 
change in correlation energy with R(O-H) than does the FCI. This is to be expected, 
since the CASSCF correctly dissociates to ground state atoms while the SCF does 
not. Since there is more correlation in the molecule than in the atoms, when 
compared to the FCI, the CASSCF’s show a difference with the FCI with R(O-H) 
which in the opposite direction from the SCF. However, the shape of the potential in 
the CASSCF calculations is in better agreement with the FCI than either the SCF 
or SDCI calculations. The inclusion of more extensive correlation reduces the error 
further, but the differential correlation effect is much smaller than that at other 
levels (for example E(MRCI)-E(CASSCF) changes by only a. factor 1.26, which 
reduces to 1.12 with the larger CASSCF reference). The inclusion of the multi- 
reference analog of the Davidson correction leads to an energy lower than that at 
the FCI level. This overshoot for the MRCI-i-Q calculations was also found for NH 2 
[3]. The inclusion of this correction reduces the error in the calculation, but the 
error is actually reduced further for the smaller reference space. Thus the increase 
in the number of references improves the MRCI results, but the MRCI-j-Q results 
do not show the same monotonic improvements with the number of references; this 
is also true for NH 2 [3]. However, the errors in either of the MRCI+Q calculations 
are acceptable, and much smaller than the single reference based approaches. 
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The ratio of the total correlation energy for the *Ai state of H 2 O to the 2 Bi 
state of NH 2 decreases from 1.27 at r e to 1.17 at 2*r e . This is to be expected, 
considering that H 2 O has one more electron than NH 2 . However, in spite of the 
increase in the total correlation energy, the accuracy of the MRCI and MRCI+Q 
potentials is very similar for H 2 0 and NH 2 , that is, errors of no more than 1.2 
kcal/mole in the potentials relative to r e ; in fact H 2 O has a slightly smaller relative' 
error. Thus the accuracy of the MRCI approach does not appear to depend on the 
number of electrons correlated for systems of this size. 

The correlation contributions for F and F - are decomposed in Table V, and 
the results for the EA is summarized in Table VI. The previous Ne atom results are 
also summarized in Table V for comparison. The total correlation energy of F“ is 
1.09 times larger than for Ne, even though they have the same number of electrons. 
For comparison F has only 83% of the correlation energy of Ne. The difference 
in correlation energy between Ne and F _ arises from the increased (by about a 

factor of two) importance of the triple (measured as E(SDT)-E(SD)), quadruple, 

> 

and higher than quadruple excitations. This is quite different from F, for which 
the higher than double excitations contribution is 85% of that for Ne, that is, the 
relative importance of the single and doubles and the higher excitations is about the 
same for F and Ne. The greater importance of the higher excitations for F~ than F 
makes the determination of the EA, which depends on obtaining all the differential 
correlation energy, a difficult task, as compared, say to a potential curve where only 
relative accuracy is needed. The importance of higher than double excitations is 
illustrated in Table VI: for the smallest basis set used ([4s3pld]) the SCF EA is in 
error by 2 eV, which is reduced by 1.37 eV with the inclusion of SD correlation, 
but the FCI EA is still larger by 0.21 eV. Higher excitations comprise about 13% 
of the correlation contribution to the EA. If the basis set is improved to [5s4p2d] , 
the contribution of the higher than double excitation increases to 15% of the total 
correlation. If the Davidson correction or CPF approach is used to account for the 
higher excitations the EA is improved over the SDCI, but is still not equal to the 
FCI result. These corrections underestimate the importance of higher excitations 
for both systems. It is well known [19] that the most important correlation effect 
for F and F - is the 2p to 2p' excitation. When this is included in the CASSCF 
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calculation, the EA is considerably improved over the SCF result, giving about 80% 
of the difference between SCF and SDCI. Using this CASSCF reference leads to a. 
MRCI EA which is better than either the CPF or SDCI+Q treatments, and in 
good agreement with the FCI, being only 0.02eV smaller. If the estimate of higher 
excitations is included, an energy lower than the FCI result is obtained for both 
F and F“. However, this correction may overestimate the higher excitations in ah 
equivalent manner for both systems, since the results at this level are equal to those 
at the FCI level. 

At the FCI level, the 2s correlation was found to contribute significantly to the 
EA of oxygen [2], In Table VI, we also report the EA when only the 2p electrons 
are correlated. While correlating only the 2p electrons increases the SDCI EA by 
0.13 to 0.18 eV relative to correlating both the 2s and 2p electrons, at the FCJ level 
the EA is increased by only 0.08 to 0.12 eV. The negative contribution to the EA 
of the 2s-2s and 2s-2p correlation decreases with the inclusion of higher excitations. 
For O/O - , with a very large basis set the 2s contribution actually increases the 
EA, but only when higher excitations are included. This is understandable in light 
of the factor of two larger contribution from higher excitations in the negative ions. 

IV. Conclusions 

The MRCI potentials (and MRCI with the multi-reference analog of the David- 
son correction) are found to be in excellent agreement with FCI calculations. The 
error in the H 2 O calculations are very similar to that found for NH 2 , even though 
the total correlation energy of H 2 O is about 1.2 times larger. The contribution of 
higher than double and of higher than quadruple excitations is found to be a factor 
of two larger for F - than Ne, whereas the single and doubles correlation energy 
differs by only 10%. For F, the single and doubles, and higher than doubles, are the 
same percentage of the correlation as in Ne. Since the distribution of the correlation 
energy by excitation level is different between F and F“, all of the correlation must 
be computed to account for the difference in order to obtain accurate EA . 
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Table I. The valence basis sets. 


s p 

O (9s5p)/[4s2p] 


exponent 

coefficient 

exponent 

coefficient 

7817.0 

0.002031 

35.18 

0.019580 

1176.0 

0.015436 

7.904 

0.124200 

273.2 

0.073771 

2.305 

0.394714 

81.17 

0.247606 

0.7171 

0.627375 

27.18 

0.611832 

0.2137 

1.000000 

3.414 

0.241205 



9.532 

1.000000 



0.9398 

1.000000 



0.2846 

1.000000 




H (4s)/[2s) 



exponent 

coefficient 



19.2384 

0.032828 



2.89872 

0.231204 



0.653472 

0.817226 



0.177552 

1.000000 

F (9s5p)/[4s2p] 


exponent 

coefficient 

exponent 

coefficient 

9994.79 

0.002017 

44.3555 

0.020868 

1506.03 

0.015295 

10.0820 

0.130092 

350.269 

0.073110 

2.9959 

0.396219 

104.053 

0.246420 

0.9383 

0.620368 

34.8432 

0.612593 

0.2733 

1.000000 

4.3688 

0.242489 



12.2164 

1.000000 



1.2078 

1.000000 



0.3634 

1.000000 
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Table II. Total energies (a.u.) for the H 2 O calculations. 


Calculation 


SCF 

* e 

-76.040542 

SDCI 

-76.243772 

FCI 

-76.256624 

CPF 

-76.252504 

SDCI+Q 

-76.254549 

CAS 

-76.094713 

MRCI 

-76.251643 

MRCI-Q 

-76.257983 

CAS(BIG) 

-76.129876 

MRCI(BIG) 

-76.254108 

MRCI(BIG) + Q 

-76.257805 


geometry" 


1.5*r e 

2*r e 

-75.800494 

-75.582286 

-76.040984 

-75.876606 

-76.071405 

-75.952269 

-76.064365 

-75.956222 

-76.067003 

-75.942257 

-75.924781 

-75.823721 

-76.066885 

-75.948557 

-76.072741 

-75.952973 

-75.953141 

-75.839916 

-76.069363 

-75.950517 

-76.072943 

-75.953731 ' 


a The 0 is located at (0,0,0) and the H nuclear coordinates are: r e (±1.494187, 0, 
1.156923), 1.5*r e (±2.241281, 0, 1.735385), and 2*r e (±2.988374, 0, 2.313846). 
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Table III. A decomposition of the correlation contributions for water, in a.u. 


Calculation 

E(SDCI)-E(SCF) 

E(FCI)-E(SCF) 

E(FCI)-E(SDCI) 

E(CPF)-E(SDCI) 

E(SDCI+Q)-E(SDCI) 

E(CAS)-E(SCF) 

E(MRCI)-E(CAS) 

E(MRCI)-E(SDCI) 

E(FCI)-E(MRCI) 

E(MRCI-fQ)-E(MRCI) 

E(FCI)-E(MR’CI+Q) 

E(CAS(BIG))-E(SCF) 

E(MRCI(BIG))-E(CAS(BIG)) 

E(MRCI(BIG))-E(SDCI) 

E(FCI)-E(MRCI(BIG)) 

E(MRCI(BIG)+Q)-E(MRCI(BIG)) 

E(FCI)-E(MRCI(BIG)— Q) 


r e 

geometry 

1.5*r e 

2*r e 

0.203230 

0.240490 

0.294320 

0.216082 

0.270911 

0.369983 

0.012852 

0.030421 

0.075663 

0.008732 

0.023381 

0.079616 

0.010777 

0.026019 

0.065651 

0.054171 

0.124287 

0.241435 

0.156930 

0.142104 

0.124836 

0.007871 

0.025901 

0.071951 

0.004981 

0.004520 

0.003712 

0.006340 

0.005856 

0.004416 

-0.001359 

-0.001336 

-0.000704 

0.089334 

0.152647 

0.257630 

0.124232 

0.116222 

0.110601 

0.010336 

0.028379 

0.073911 

0.002516 

0.002042 

0.001752 

0.003697 

0.003580 

0.003214 

-0.001181 

-0.001538 

-0.001462 
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Table IV. A comparison of the potential curves for water, in a.u. All curves are 
shifted in energy to bring the energies at r e into agreement with that at the full Cl 
level. The difference energy between r e and the other geometries is compared to' 
the FCI potential. 


Calculation 

1.5*r e -r e 

2*r e -r e 


SCF 

-0.05482900 

-0.15390100 


SDCI 

-0.01756900 

-0.06281100 


CPF 

-0.00292000 

0.00807300 


SDCI+Q 

-0.00232700 

-0.00793700 


CAS 

0.01528700 

0.03336300 


MRCI 

0.00046100 

0.00126900 


MRCI+Q 

-0.00002300 

-0.00065500 


CAS(BIG) 

0.00848400 

0.01439500 


MRCI(BIG) 

0.00047400 

0.00076400 


MRCI(BIG) + Q 

0.00035700 

0.00028100 



13 



I> 111- 


Table V. A comparison of the correlation contributions, in a.u, for F , F and Ne. 



F ion 



Basis 

[4s3p] [4s3pld] 

[4s3p2d] 

[5s4p2d] 

E(SCF) 

-99.442848 -99.442848 

-99.442848 

-99.443696 

E(SD)-E(SCF) 

0.132219 0.197820 

0.220160 

0.245405 

E(SDT)-E(SD) 

0.001913 0.003241 

0.004297 

0.006369 

E(SDTQ)-E(SDT) 

0.006558 0.008848 

0.009707 

0.010480 

E(FCI)-E(SDTQ) 

0.000486 0.000584 

0.000664 

0.000740 

E(FCI)-E(SD) 

0.008957 0.012673 

0.014668 

0.017589 

E(SD-f-Q)-E(SD) 

0.006106 0.009711 


0.012071 

E(CPF)-E(SD) 

0.005043 0.008155 


0.010077 

E(CASSCF)-E(SCFj 



0.107265 

E(MRCI)-E(CASSCF) 



0.152776 

E(MRCI)-E(SD) 



0.014636 

E(MRCI+Q)-E(SD) 

F atom 


0.018473 

Basis 

[4s3pld] 

[4s3p2dj 

[5s4p2dj 

E(SCF) 

-99.394273 

-99.394273 

-99.394684 

E(SD)-E(SCF) 

0.147416 

0.165916 

0.192421 . 

E(FCI)-E(SD) 

0.004931 

0.006294 

0.007772 

E(CPF)-E(SD) 



0.004741 

E(SD+Q)-E(SD) 



0.006344 

E(FCI)-E(SCF) 

0.152347 

0.172210 

0.200193 

E(CASSCF)-E(SCF) 



0.061620 

E(MRCI)-E(SD) 



0.005684 

E(MRC1+Q)-E(SD) 

Ne atom 0 


0.008689 

Basis 



[5s3p2dj 

E(SD)-E(SCF) 



0.235733 

E(SDT)-E(SD) 



0.003258 

E(SDTQ)-E(SDT) 

- 


0.005670 

E(FCI)-E(SD) 



0.009131 

E(FCI)-E(SDTQ) 



0.000203 

E(CPF)-E(SD) 



0.005276 

E(SD+Q)-E(SD) 



0.006823 


0 Results are taken from Reference 1. 
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Table VI. A comparison of the computed EA a for F, in eV. 


2s and 2p correlated 


Basis 

[4s3pld] 

[4s3p2d] 

[5s4p2d] 

SCF 

1.32 

1.32 

1.33 

SDCI 

2.69 

2.79 

2.78 

FCI 

2.90 

3.03 

3.04 

DVD 



2.93 

CPF 



2.92 

CAS 



2.58 

MRCI 



3.02 

MRCI-fQ 


2p correlated 

3.04 

Basis 

[4s3pld! 

[4s3p2d] 

(5s4p2d) 

SDCI 

2.82 

2.97 

2.95 

FCI 

2.98 

3.15 

3.16 


“ For comparison the experimental value is 3.399 eV, see Reference 20. 
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Abstract 

Full configuration-interaction (FCI) calculations are performed at selected ge- 
ometries for the 1 T.' t state of HF and the 2 Bi and 2 Ai states of NH 2 using both DZ 
and DZP gaussian basis sets. Higher excitations become more important when the 
bonds are stretched and the SCF reference becomes a poorer zeroth-order descrip- 
tion of the wave function. The CASSCF-MRCI procedure gives excellent agreement 
with the FCI potentials, especially when corrected with a multi-reference analog of 
the Davidson correction. 
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I. Introduction 

Recent improvements in method? for full configuration-interaction (FCI) cal- 
culations [1-2] combined with the extensive memory (>256 million words) and ex- 
cellent vector capabilites of the CRAY 2, permit FCI calculation with larger basis 
sets than used in previous benchmark calculations [3-4]. Recently, we presented 
FCI calculations for the J S state of Ne atom [5] to assess the reliability of methods 
such as the Davidson correction [6] and the coupled pair functional (CPF) [7] for 
estimating the energy contribution of higher excitations. An important observation 
was that the accuracy of both the Davidson correction and the CPF approximation 
depended on basis set quality. For example, the CPF accounted for only 40% of 
the quadruples contribution for a DZ basis set, but 60% of the quadruples contri- 
bution for a DZP basis set. However, the total contribution of higher excitations 
was relatively small in Ne, which is well described by an SCF reference. To investi- 
gate further the accuracy of approximate methods of including higher excitations, 
we consider herein the 1 £ + state of the isoelectronic HF molecule and the 2 Bi and 
2 Aj states of NH 2 using both DZ and DZP gaussian basis sets. To investigate struc- 
tures where the SCF is not a good zeroth-order description we consider geometries 
aw’ay from equilibrium. 

II. Methods 

For the nitrogen and fluorine atoms we used the Dunning 4s2p contraction [8] of 
the Huzinaga 9s5p primitive basis sets [9]. For hydrogen we used the 2s contraction 
[8] of the Huzinaga 4s primitive set scaled by a factor of 1.2. When polarization 
functions are included, the exponents are: F (3d =1.6), N(3d=0.9), and H(2p=0.8). 
The 3s component of the 3d functions is deleted in all calculations. 

For HF the geometries considered are r e (1.733 bohr), 1.5 times r« (2.5995 
bohr), and twice r e (3.466 bohr). For NH 2 we consider r e , 1.5 times r e and twice r e , 
as well as a fourth point with the H-H bond distance at the H 2 equilibrium value 
and the N-H distance at about twice the r e for NH 2 . The NH 2 molecule is placed 
in the xz plane, with the N at the origin. The coordinates actually used for NH 2 
are given explicitly in Table I. 

In this study we have used both an SCF and a complete-active-space self- 
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consistent field (CASSCF) wave function [10] as the zeroth-order reference. The 
SCF reference is used for the single-reference singles plus doubles configuration- 
interaction calculation, SDC1, SDCI -j- triples (SDT), SDT + quadruples (SDTQ), 
the coupled pair functional (CPF) wave function and the Chong-Langhoff modifica- 
tion [ll] of CPF (MCPF). The SCF reference is also used for the FCI calculations, 
which are found to be invariant to the orbital basis to within a few microhartrees. 
The slight differences arise because the two core electrons on nitrogen and fluorine 
are not correlated in any calculations since this restriction dramatically reduces the 
length of the FCI expansion. For the 1 E + state of HF the SCF reference config- 
uration is lo 2 2o 2 3o 2 l7r 4 , and for the 2 Bi state of NH 2 it is Ia 2 2a 2 3a 2 l6jl6 2 at 
all geometries. For the 2 Ai state of NH 2 the three geometries stretching the two 
N-H bonds correspond to the 3a x -+lbi excitation relative to the 2 Bj configuration 
whereas the fourth point denoted N- • -H-H corresponds to the lb 1 — +4aj excitation. 

The multi-reference Cl calculations (MRCI) are based on CASSCF wave func- 
tions. For HF, the hydrogen Is and fluorine 2po orbitals and electrons are active. 
The MRCI calculations consist of single and doubles from the two non-vanishing 
configurations in the CASSCF wave function. For both states of NH 2 , the nitrogen 
2s and 2p orbitals and electrons are active, as well as the two hydrogen Is orbitals 
and electrons. The first set of MRCI calculations using these CASSCF optimized 
orbitals include all references arising from all distributions of the nitrogen 2p and 
hydrogen Is electrons among the active orbitals; hence the 2s electrons are cor- 
related, but the 2s orbital is doubly occupied in all ..reference configurations. In 
the second set of MRCI calculations, denoted MRCI(BIG), all configurations in 
the CASSCF are included as references. For the SDCI wave functions we also in- 
clude the Davidson estimate for unlinked quadruple excitations, denoted +Q. For 
the MRCI calculations we use a multi-reference analogue of this correction, namely 
AEsp where Asd is the difference between the energy of the reference 

CSF’s and the MRCI, and the Cr are the coefficients of the reference configurations 
in the MRCI wave function. 

HI. Results and discussion. 

The total energies at the SCF and FCI level are summarized in Table I for 
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both HF and NH 2 . The molecular geometries used for the 2 Bi and 2 Ai states of 
NH 2 are given explicitly as well. 

In Table II we have summarized the Cl results for HF using both the DZ and 
DZP basis sets at three geometries (r e ,1.5*r e ,2*r e ). It is interesting that although 
the SDCI-SCF energy difference is considerably larger for the DZP basis, this dif- 
ference increases more slowly with increasing R than for the DZ basis. The ratio of 
this difference at 2*r e compared to r e is 1.21 with the DZP basis and 1.47 with the 
DZ basis. Hence, the addition of the polarization function substantially improves 
the description of the distortions taking place as the bond is broken, and less of 
this effect shows up as electronic correlation. For the DZP basis the energy contri- 
bution of the triples, quadruples and higher than quadruple excitations all increase 
at about the same rate as the bond is broken (by about a factor of three between 
2*r e and r e ). The energy contribution of quadruple excitations at 2*r e using the 
DZP basis is about 0.5 eV, which is about 40 times greater than the combined 
contribution of quintuple through octuple excitations. 

The results in Table II show that the three configuration CASSCF calculation 
followed by all single and double excitations from the two configurations (a 2 and 
cr* 2 ) that have non-vanishing coefficients in the CASSCF, provide a much more 
uniform description of the potential. Also, the multi-reference quadruples correction 
is much more uniform as a function of bond distance. 

The next three rows for each basis set in Table II give a measure of the re- 
liability of CPF methods and the Davidson correction for estimating the energy 
contribution of higher excitations. Note that at r e these corrections all underesti- 
mate the quadruples correction, but as the bond length is increased the corrections 
become a substantial overestimate. In fact the SDCI+Q energies at 2*r e are well 
below 7 the FCI energies. Note also that this overcorrection of SDCI+Q is much less 
severe for the DZP basis than the DZ basis. 

Since it is a rather stringent requirement of any method to reproduce the FCI 
total energies, a better criterion for judging a method is how well the resulting 
potentials parallel the FCI potential. In Table HI we report for HF the energy 
difference between r e and 1.5*r e and 2*r e at different levels of theory. That is, 
all potentials are normalized at r e so that the energy differences in Table III reflect 
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directly deviations with the FCI potential. The SCF description becomes quite poor 
as the bond is stretched, although somewhat less so for the DZP basis. The CASSCF 
description is better, but overcorrects because it overestimates the contribution' 
of the dissociative configuration. The SDCI is a substantial improvement over 
SCF, but still retains some of the bias of the SCF. The SDCI results are improved 
by the Davidson correction, especially for the DZP basis, but overestimates the 
effect of higher excitations. The coupled pair methods are generally more reliable 
than SDCI-i-Q, and the MCPF results for the DZP basis are in particularly good 
agreement with the FCI results. Note that the results at the SDT level are still 
inadequate since the energy contribution of quadruple excitations is both large and 
rapidly increasing as the bond is broken. At the SDTQ level the error at 2*r e in 
the DZP basis is less than 0.02 eV. However, the SDTQ configuration expansions 
are quite lengthy (48, 963 CSFs for the DZP basis), and hence do not represent 
an optimal approach of including higher excitations. This is illustrated by the 
results of the much smaller MRCI expansions (1015 CSFs), which are of comparable 
quality. Most impressive, however, are the MRCI+Q results which agree with the 
FCI potential to well within chemical accuracy in every case. The comparison of 
the MRCI and MRCI+Q results in Table III provide strong support for the validity 
of the multi-reference analog of the Davidson correction. 

In addition to the dissociation of HF, where one chemical bond is being broken, 
we consider for the 2 Bj and 2 Ai states of NH 2 the simultaneous extension of both 
N-H bonds. The energy difference between the FCI and various levels of theory 
using both the DZ and DZP gaussian basis sets are summarized for the 2 Bi and 
2 Aj states in Tables IV and V, respectively. Four geometries are considered - 
equilibrium, both bonds stretched to 1.5 and 2.0 times r e , and an N* ■ -H-H structure 
with the H-H bond length that of the ground state of H 2 and the N-H bond at about 
2*r e . Explicit coordinates are given in Table I. As for the HF molecule, the SCF 
reference becomes an increasingly poorer zeroth-order description of the system as 
the bond length is increased, particularly for the 2 Bi ground state. Although the 
SDCI accounts for a substantial portion of this difference, the difference with the 
FCI and hence the contribution of higher excitations increases rapidly as the bonds 
are stretched. In contrast, the difference between the FCI and CASSCF is more 


5 



•II 111- 


constant and actually decreases slightly with increasing r; hence the errors in the 
MRCI treatment are generally less at 2*r e than at r e . In general, the differences 
with the FCI are further reduced wheft the multi-reference quadruples correction is 
added, although in every case MRCI+Q is below the FCI energy. The coupled pair 
functional methods and the SDCI+Q, which are based on the SCF reference, have 
larger differences with the FCI, and these differences increase as the SCF reference 
becomes a poorer representation of the wave function. These approximate methods 
for incorporating higher excitations are substantially closer to the FCI energies 
than are the SDCI energies. Generally they give energies that lie above the FCI for 
the T e and 1.5*r e geometries, but often overshoot (particularly CPF) the energy at 
2*r e . The MCPF method, which uses somewhat more complex but more realistic 
renormalization denominators, tends to overshoot less and thus has a larger domain 
of applicability. 

The theoretical potentials at various levels of theory are compared to the FCI 
potentials for the 2 Bi and 2 Ai states in Tables VI and VII, respectively. These 
results again illustrate how poor the SCF potential becomes as r increases. The 
CASSCF overestimates the importance of the dissociative configurations and errs in 
the opposite direction, although it is better than the SDCI potential, which retains 
much of the bias of the SCF. However, the Davidson correction helps substantially 
and the SDCI+Q potential is approaching chemical accuracy. The MRCI potentials 
are substantially better. Again, the multi-reference Davidson correction generally 
gives further improvements in the potentials. 

The energy between the minimum in the 2 Bi and 2 Aj potentials of NH 2 (T e ) 
is given with respect to the FCI result at each level of correlation treatment for 
the DZ and DZP basis sets in Table VIII. Since the SCF reference provides nearly 
equivalent descriptions of both states, the differences with the FCI results are not 
very large. Apart from the SCF and CASSCF results, the T e are within 0.05 eV 
of the FCI result. Interestingly the multi-reference Davidson correction actually 
makes the agreement worse, although the errors are in every case small. 

IV. Conclusions 

The CASSCF MRCI calculations are in excellent agreement with the FCI cal- 
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culations, especially after including a correction for quadruple excitations. This is 
not surprising considering that the CASSCF potential parallels the FCI potential 
better than does the SDCI potential. The inclusion of an estimate of higher excita- 
tions, either by the Davidson correction or by CPF works reasonably well, except 
for NH 2 at 2*r e , where the SCF reference is much poorer. The MCPF method 
gives an improved description of the 2*r e point, but does not significantly alter the 
results at the other points, where the SCF is a better reference. 

The accuracy of the different approximations are found to vary somewhat with 
the quality of the basis set used. These results should supply a better test of 
methods than the previous FCI calculations, most of which were restricted to a DZ 
basis set. 
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Table I. Total energies (a.u.) for the full CI(SCF) calculations. 



DZ 

t 

DZP 

geometry® 

r e 

-100.147204 (-100.021973) 

-100.250969(-100.047087) 

1.733 

1.5*r e 

-100.079441 (-99.924625) 

-100. 160393 (-99.933229) 

2.5995 

2*r e 

-100.008676(-99. 815206) 

-100.081108(-99. 817572) 

3.466 



NH 2 2 Bj 



DZ 

DZP 

geometry 6 (x,z) 

r e 

-55.646028 (-55.543825) 

-55.742620(-55.573008) 

1.5186,1.1993 

1.5*r e 

-55. 534809(-55. 373780) 

-55.605209(-55.387413) 

2.2779,1.79895 

2*r e 

-55.449427(-55.185112) 

-55. 505524(-55. 188719) 

3.0372,2.3986 

N+Hj 

-55.472746(-55.38314l) 

-55.544560(-55.388944) 

NH a 2 Ai 

0.7006,3.8062 


DZ 

DZP 

geo(x,z) 

r e 

-55.603404 (-55.505424) 

-55.688762(-55.523192) 

1J972, 0.5840 

1.5*r e 

-55 449846(-55.311550) 

-55. 5l7614(-55. 32145) 

2.6958,0.8760 

2*r e 

-55.355766(-55.155112) 

-55. 415133(-55. 157046) 

3.5944,1.1680 

n+h 2 

fit Tl. U 

-55.4621 19(-55.364954) 

t r T "h 1 

-55.53608l(-55.370425) 

0.7006,3.8062 


“ The H-F bond length in bohr. 

6 The x,z corridinates, where the molecule is placed in the xz plane with the N at 
0,0,0, and the H atoms at x,0,z, and -x,0,z. 
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Table II. Energy differences (au) between different levels of correlation treatment 
for the 1 E + state of HF. 


A. DZ BASIS 

r e 

1.5*r e 

2*r e 

SDCI-SCF 

SDT-SDCI 

SDTQ-SDT 

FC1-SDTQ 

-0.11951300 

-0.00106500 

-0.00444400 

-0.00020900 

-0.14499600 

-0.00189500 

-0.00756900 

-0.00035600 

-0.17531200 

-0.00491100 

-0.01261700 

-0.00063000 

MRCI-CASSCF 

MRCI-rQ-MRCI 

-0.09672100 

-0.00251900 

-0.09518000 

-0.00273100 

-0.08502900 

-0.00228300 

CPF-SDCI 

SDC1+Q-SDCI 

MCPF-SCCI 

-0.00302000 
-0.00391000 
. -0.00320500 

-0.00637900 

-0.00914200 

-0.00712900 

-0.01430100 

-0.02510300 

-0.01713700 

B. DZP BASIS 




SDCI-SCF 

SDT-SDCI 

SDTQ-SDT 

FCI-SDTQ 

-0.19450300 

-0.00236800 

-0.00672900 

-0.00028200 

-0.21229400 

-0.00375300 

-0.01062300 

-0.00049400 

-0.23596100 

-0.00842200 

-0.01823500 

-0.00091800 

MRCI-CASSCF 

MRCI+Q-MRCI 

-0.17409400 

-0.00607600 

-0.16719100 

-0.00615600 

-0.15418300 

-0.00528000 

CPF-SDCI 

SDCI+Q-SDCI 

MCPF-SDCI 

-0.00613000 

-0.00778300 

-0.00640100 

-0.01063900 

-0.01345900 

-0.01139400 

-0.02227100 

-0.02886600 

-0.02466700 
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Table III. Energy differences (au) between the FC1 and different levels of correlation 
treatment for the 1 E + state of HF. 


Method 

DZ Basis 
1.5*r e -r e 

2*r e -r e 


SCF 

0.029585 

0.068239 


SDCI 

0.004102 

0.012440 


SDCI+Q 

-0.001130 

-0.008753 


CPF 

0.000743 

0.001159 


MCPF 

0.000178 

-0.001492 


SDT 

0.003272 

0.008594 


SDTQ 

0.000147 

0.000421 


CASSCF 

-0.001289 

-0.011865 


MRCI 

, 0.000252 

-0.000173 


MRCI+Q 

0.000040 

0.000063 


SCF 

DZP basis set 
0.023282 

0.059654 


SDCI 

0.005491 

0.018196 


SDCI+Q 

-0.000185 

-0.002887 


CPF 

0.000982 

0.002055 


MCPF 

0.000498 

-0.000070 


SDT 

0.004106 

0.012142 


SDTQ 

0.000212 

0.000636 


CASSCF 

-0.006811 

-0.020667 


MRCI 

0.000092 

-0.000756 


MRCI-i-Q 

0.000012 

0.000040 
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Table IV. Energy differences (au) between the FCI and other levels of correlation 
treatment for the 2 B\ state of NH 2. 0 


DZ Basis 


Method 

Te 

1.5*r e 

2*r e 

N- • -H 2 

SCF 

0.102203 

0.161029 

0.264315 

0.08960518 

SDCI 

0.004609 

0.016439 

0.055109 

0.00621524 

MCPF 

0.001403 

0.002836 

0.009711 

0.00032756 

CpF'fc 

0.001489 

0.002595 

-0.005823 

0.00082237 

CPF 

0.001460 

0.001868 

-0.023677 

0.00078711 

SDCI+Q 

0.000447 

-0.000890 

-0.004487 

0.00075817 

CASSCF 

0.051976 

0.045721 

0.039039 

0.04644218 

MRCI 

0.001172 

0.000714 

0.000542 

0.00114810 

MRCI(BIG) 

0.001116 

0.000644 

0.000509 

0.00098085 

MRCI+Q 

-0.000154 

-0.000492 

-0.000264 

-0.00029528 

MRCI(BIG)+Q 

-0.000055 

-0.000355 

^0.000219 

-0.00007293 


DZP Basis 


SCF 

0.169612 

0.217796 

0.316805 

0.15561649 

SDCI 

0.009003 

0.023472 

0.069157 

0.01329291 

MCPF 

0.002365 

0.004967 

0.015670 

0.00200373 

CpF'fc 

0.002509 

0.004707 

0.003116 

0.00178015 

CPF 

0.002480 

0.004190 

-0.009212 

0.00169289 

SDCI+Q 

0.000572 

0.001584 

0.009026 

0.00244093 

CASSCF 

0.121869 

0.107084 

0.094456 

0.11400831 

MRCI 

0.003446 

0.002279 

0.001501 

0.00337559 

MRCI(BIG) 

0.003202 

0.001940 

0.001338 

0.00292420 

MRCI+Q 

-0.001271 

-0.002047 

-0.001735 

-0.00162566 

MRCI(BIG)+Q 

-0.001239};* 

-0.001980 

-0.001741 

-0.00146699 


0 Negative entry indicates the energy is lower than the FCI. 

6 The Chong-Langhoff implemention of CPF [11], which for open shell systems 
differs from that of Ahlrichs et al. [7]. 
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Table V. Energy differences (au) between the FCI and other levels of correlation 
treatment for the 2 Aj state of NHj.® 


DZ Basis 


Method 

r e 

1.5*r e 

2*r e 

n--*h 2 

SCF 

0.097980 

0.138296 

0.200654 

0.09716511 

SDCI 

0.004336 

0.012032 

0.032600 

0.01312506 

MCPF 

0.001456 

0.003365 

-0.000088 

0.00118610 

CPF' fr 

0.001532 

0.003347 

-0.018390 

0.00297603 

CPF 

0.001519 

0.003375 

-0.014174 

0.00289667 

SDCI+Q 

0.000616 

0.000893 

-0.004761 

0.00542632 

CASSCF 

0.058332 

0.058208 

0.043838 

0.04449573 

MRCI 

0.001251 

0.001572 

0.000811 

0.00090852 

MRCI(BIG) 

0.001009 

0.001114' 

, 0.000735 

0.00087816 

MRCI+Q 

-0.001631 

-0.002968 

-0.000326 

-0.00005773 

MRCI(BIG)+Q 

-0.000516 

-0.000610 

-0.000238 

-0.00007346 



DZP Basis 



SCF 

0.165570 

0.196167 

0.258087 

0.16565612 

SDCI 

0.008482 

0.018097 

0.048673 

0.02229559 

MCPF 

0.002290 

0.004900 

0.005865 

0.00550461 

CPF' fc 

0.002431 

0.004970 

-0.015832 

0.00424528 

CPF 

0.002413 

0.005022 

-0.016182 

0.00412679 

SDCI+Q 

0.000618 

0.002403 

0.006886 

0.00922251 

CASSCF 

0.127696 

0.118050 

0.102355 

0.11461881 

MRCI 

0.003929 

0.003935 

0.002267 

0.00316584 

MRCI(BIG) 

0.003228 

0.002836 

0.001803 

0.00278544 

MRCI+Q 

-0.003106 

-0.005010 

-0.001670 

-0.00117540 

MRCI(BIG)+Q 

. -0.001809 

-0.002219 

-0.001918 

-0.00157857 


tt Negative entry indicates the energy is lower than the FCI. 
b The Chong-Langhoff implemention of CPF [11], which for open shell systems 
differs from that of Ahlrichs et al. [7]. 
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Table VI. Energy differences (au) between the FCI and different levels of correlation 
treatment for the 2 Bi state of NH 2 . 


DZ Basis 


Method 

1.5*r e -r e 

2*r e -r e 

N---H 2 -r e 

SCF 

0.05882552 

0.16211179 

-0.01259813 

SDCI 

0.01183033 

0.05050009 

0.00160673 

MCPF 

0.00143290 

0.00830835 

-0.00107530 

CPF'° 

0.00110573 

-0.00731251 

-0.00066686 

CPF 

0.00040776 

-0.02513683 

-0.00067285 

SDCI+Q 

-0.00133646 

-0.00493371 

0.00031155 

CASSCF 

-0.00625527 

-0.01293716 

-0.00553361 

MRCI 

-0.00045866 

-0.00062996 

-0.00002432 

MRCI(BIG) 

-0.00047153 

-0.00060722 

-0.00013494 

MRCI-rQ 

-0.00033712 

-0.00010991 

-0.00014079 

MRCI(BIG)+Q 

-0.00029993 

-0.00016385 

-0.00001821 


DZP Basis 


SCF 

0.04818368 

0.14719287 

-0.01399544 

SDCI 

0.01446939 

0.06015405 

0.00428998 

MCPF 

0.00260180 

0.01330525 

-0.00036150 

CPF' fl 

0.00219774 

0.00060737 

-0.00072896 

CPF 

0.00171006 

-0.01169217 

-0.00078678 

SDCI+Q 

0.00101197 

0.00845456 

0.00186908 

CASSCF 

-0.01478520 

-0.02741322 

-0.00786048 

MRCI 

“-3.00116640 

-0.00194520 

-0.00007021 

MRCI(BIG) 

-0.00126231 

-0.00186384 

-0.00027800 

MRCI+Q 

-0.00077575 

-0.00046378 

-0.00035453 

MRCI(B1G)+Q 

-0.00074106 

-0.00050181 

-0.00022757 


0 The Chong-Langhoff implemention of CPF (llj, which for open shell systems 
differs from that of Ahlrichs et al. [7]. 
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Table VII. Energy differences (au) between the FCI and different levels of correlation 
treatment for the 2 Ai state of NH 2 . 


DZ Basis 


Method 

1.5*r e -r e 

2*r e -r e 

N-*-H 2 -r e 

SCF 

0.04031524 

0.10267371 

-0.00081533 

SDCI 

0.00769570 

0.02826419 

0.00878918 

MCPF 

0.00190896 

-0.00154370 

-0.00026974 

CPF' Q 

0.00181481 

-0.01992232 

0.00144354 

CPF 

0.00185597 

-0.01569280 

0.00137788 

SDCI+Q 

0.00027680 

-0.00537750 

0.00480983 

CASSCF 

r 0.00012341 

-0.01449361 

-0.01383588 

MRCI 

0.00032109 

-0.00043952 

-0.00034228 . 

MRCI(BIG) 

0.00010478. 

-0.00027385 

-0.00013077 * 

MRCI-rQ 

-0.00133786 

0.00130498 

0.00157288 

MRCI(BIG)+Q 

-0.00009474 

0.00027806 

0.00044224 


DZP Basis 


SCF 

0.03059766 

0.09251729 

0.00008646 

SDCI 

0.00961563 

0.04019126 

0.01381379 

MCPF 

0.00261021 

0.00357473 

0.00321466 

CPF' Q 

0.00253934 

-0.01826289 

0.00181431 

CPF 

0.00260866 

-0.01859546 

0.00171334 

SDCI+Q 

0.00178540 

0.00626841 

0.00860488 

CASSCF 

-0.00964579 

-0.02534059 

-0.01307699 

MRCI 

0.00000590 

-0.00166202 

-0.00076318 

MRCI(BIG) 

-0.00039130 

-0.00142503 

-0.00044209 

MRCI+Q 

-0.00190407 

0.00143620 

0.00193059 

MRCI(BIG)+Q 

-0.00041014 

-0.00010893 

0.00023031 


0 The Chong-Langhoff implemention of CPF [llj, which for open shell systems 
differs from that of Ahlrichs et al. [7]. 
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Table VIII. T e ’s relative to the full CI. 


Method 

* 

DZ 

DZP 

SCF 

-0.004223 

-0.004042 

SDCI 

-0.000273 

-0.000521 

MCPF 

0.000053 

-0.000075 

CPF' a 

0.000043 

-0.000078 

CPF 

0.000059 

-0.000066 

SDCI+Q 

0.000170 

0.000046 

CASSCF 

0.006356 

0.005827 

MRCI 

0.000078 

0.000483 

MRCI(BIG) 

-0.000107 

0.000025 

MRCI+Q 

-0.001476 

-0.001835 

MRCI(BIG)+Q 

-0.000461 

-0.000569 

FCI b 

0.042624 

0.053858 


0 The Chong-Langhoff implemention of CPF [11], which for open shell systems 
differs from that of Ahlrichs et al. [7], 
b Full CI T e . 
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Abstract 


The electron affinity of oxygen is computed to be 1.287 eV, using 2p electron 
full Cl wave functions expanded in a 6s5p3d2f Slater-type orbital basis. The best 
CASSCF-MRCI result including only 2p correlation is 1.263 eV. However, inclusion 
of 2s intrashell and 2s2p intershell correlation increases the computed EA to 1.290 
at the CASSCF-MRCI level. At the full Cl basis set limit, the 2s contribution to the 
electron affinity is estimated to be as large as 0.1 eV. This study clearly establishes 
the synergistic effect between the higher excitations and basis set completeness on 
the electron affinity when the 2s electrons are correlated. 
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I. Introduction 

The calculation of the electron affinity (EA) of the oxygen atom has proved to 
be a challenging task [1-5]. A noteworthy study of atomic correlation and its effects 
on calculated electron affinities is the work of Sasaki and Yoshimine [l] (SY). Using 
extremely large Slater basis sets, they were able to estimate the SDCI basis limit 
for the electron affinity of oxygen as 1.041 eV, which is substantially less than the 
experimental value [6] of 1.462 eV. This work indicated that earlier pair-correlation 
calculations [7-10] obtained electron affinities in fortuitously good agreement with 
experiment owing to a cancellation between atomic basis set incompleteness and the 
excess energy from neglected pair-pair interaction energies. By including selected 
triple and quadruple excitations, SY obtained an electron affinity of 1.17 eV includ- 
ing only L shell correlation. SY also showed that correlation of the Is electrons 
makes a very small contribution to the EA. 

The study of Botch and Dunning [2] (BD), demonstrates that the differen- 
tial higher excitation contribution to the EA is more efficiently accounted foi; by 
an MCSCF multi-reference Cl procedure, MRCI, than by using selected triple and 
quadruple excitations from an SCF reference. Their MCSCF calculation was re- 
stricted to double excitations out of the 2p into 2p' correlating orbitals, and yielded 
an EA of 0.46 eV, about 1 eV improvement over the Hartree-Fock limit of -0.54 eV. 
When all single and double excitations from this MCSCF reference were included, 
they obtained an electron affinity of 1.09 eV. This value is less than that of SY 
owing to the much smaller basis set employed, but indicates a considerably larger 
contribution from higher excitations. The most important correlation contribution 
was observed to arise from 2p— »2p' excitations. 

The first systematic study of the EA of oxygen correlating only the 2p electrons 
was given by Bauschlicher [3]. Using a large (6s6p3d2f) Slater-type orbital (STO) 
basis that is within about 0.05 eV of the SDCI basis set limit value, a CASSCF 
calculation with the 2p and 2p' orbitals in the active space was performed. This 
CASSCF yields an EA of 0.59 eV which is similar to the EA from the MCSCF 
calculation of Botch and Dunning [2], More extensive correlation was included via 
a second-order Cl including only the 2p electrons. The second-order Cl yields an 
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EA of 1.26 eV, which is larger than that of SY, but still in error by 0.2 eV. When 
the MCSCF and MRCI reference spaces were expanded to include the 3d shell, the 
EA increased to only 1.28 eV. 

Recently, Feller and Davidson [4] (FD) calculated the EA of O using an 
MCSCF-MRCI approach. Unlike Bauschlicher [3], FD explicitly included the 2s 
in both the MCSCF and multi-reference Cl calculations. The gaussian type orbital 
(GTO) basis used by FD is within 0.02 eV of the SY SDCI limit EA. The results 
of FD parallel those of Bauschlicher, and their best EA is 1.29 eV, or 1.32 eV if an 
estimate of higher excitations is made. This work suggests that 2s does not have a 
differential correlation contribution to the EA, and therefore either the differential 
contribution of higher excitations to the EA converges very slowly with basis set, 
or else none of the MCSCF-MRCI studies to date have properly accounted for this 
2s contribution. This latter possibility seems unlikely considering the stability of 
the EA to further improvements in the treatment, e.g. including the 3d shell in the 
active space. However, Raghavachari [5] finds with a comparable basis set that a 
coupled cluster doubles (CCD) calculation with a correction for single and triple 
excitations yields an EA of 1.41 or 1.36 eV, depending upon the approximation used 
for the single and triple excitations. 

It has recently become possible to perform very large full Cl calculations which 
can be analyzed to separate the effect of higher excitations from basis set incomplete- 
ness. This is a result of: (i) Siegbahn’s realization [11] that the full Cl procedure 
can be vectorized in terms of matrix multiplies, (ii) Knowles and Handy’s sugges- 
tion [12] that the Siegbahn approach be changed to determinants from configuration 
state functions, thus eliminating the 10 bottleneck associated with the formula file, 
and (iii) the delivery of the CRAY 2 which allows very large Cl expansions because 
of its extremely large memory. 

In this work we report full Cl calculations of the EA of O atom, and compare 
these results to those obtained using an MCSCF-MRCI approach. One of the goals 
of the present study is to decompose the remaining error in the EA into effects of ba- 
sis set incompleteness, higher excitations not accounted for by the CASSCF-MRCI 
treatment, and to contributions from correlating the 2s electrons. In addition, we 
report on our initial attempts to compute the EA using Green’s function Monte 
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Carlo methods [13]. This approach seems particularly relevant to the EA affinity 
problem since it is potentially capable of accounting for all of the electronic correla- 
tion energy. However, technical problems make such calculations difficult at present 
for systems with this many electrons. 

Section II contains a brief description of the methods and basis sets employed 
in this study. Section III contains an analysis of the full Cl and CASSCF-MRCI 
calculations. Section IV contains a description of the Monte Carlo calculations. 
The conclusions are given in Section V . 

II. Methods 

In our theoretical calculations we employ a 6s5p valence Slater-type orbital 
set obtained by combining six s functions optimized for O, and five p functions 
optimized for O - [14]. The total SCF energy of the 6s5p basis is only 0.0001 
Hartree above the numerical Hartree Fock (NHF) energy; this can be compared to, 
for example, an uncontracted 12s7p GTO basis set [15] which has an error six times 
larger. However, both O and O - are affected similarly by basis set limitations, so 
the EA is at the HF limit. 

To the 6s5p valence basis we construct a small basis by adding one 3d function 
with an exponent of 2.66 and a larger 3d2f polarization set with exponents of 4.0, 
2.8284 and 2.0 for the three 3d functions and exponents of 4.06 and 2.87 for the 
— two 4f functions. The exponents were optimized by Bauschlicher [3], under an even- 
tempered constraint, by minimizing the mean of the O and O - energies at the SDCI 
level, with only the 2p electrons correlated. However, the optimal exponents are 
not significantly different for O and O - and hence little bias is expected for the 
larger polarization set. 

Several zeroth-order references are used. The simplest is the SCF, in which 
symmetry and equivalence restrictions are imposed. For the MCSCF reference 
spaces, we use the CASSCF approach. In the smaller CASSCF only the 2p elec- 
trons are active, and are distributed in the 2p and 2p' orbitals; this CASSCF wave 
function is denoted CASSCF(2p). Since the question of the correlation effect of the 
O 2s is of interest, a second CASSCF wave function is also used, CASSCF(2s2p), 
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which includes the 2s as active and adds a 2s' correlating orbital. 

In order to analyze the various correlation effects, more extensive correlation is. 
added to the zeroth-order references in several ways. To separate the contribution 

N 

to the EA from the 2s and 2p electrons, only the 2p electrons are correlated in some 
calculations, denoted CI(2p), while in others both the 2s and 2p are correlated, 
denoted CI(2s2p). For the SCF reference, only single reference SDCI calculations, or 
full Cl calculations, FCI, are performed; the calculations are denoted CI(SCF,2s2p), 
for a single reference Cl calculation (using SCF orbitals) which correlated the 2s 
and 2p electrons, or FCI(2p) for a full Cl calculation which correlates only the 2p 
electrons. For the single reference Cl calculations, we use the Davidson correction 
[16] (denoted +Q) to estimate the importance of quadruple excitations. For the 
CASSCF optimized orbitals, all calculations consist of single and double excitations 
from all of the configurations in the CASSCF calculation. For these calculations 
the notation indicates the origin of the orbitals and which electrons are correlated. 
Thus CI(CASSCF(2p),2s2p) denotes a calculation using orbitals optimized in a 
CASSCF(2p) calculation, and which includes single and double excitations out of 
the 2s and 2p orbitals in all configurations in the CASSCF (2p) wave function, 
while CI(CASSCF(2p),2p) involves the same orbitals and same references, but the 
2s electrons are not correlated. 

III. Results and discussion 

The improvement of the electron affinity with enhancements of the polarization 
basis at the CI(SCF,2p) level is shown in Table I. These results using the 6s6p 
valence basis [3] are essentially unchanged if the valence basis is replaced by the 
6s5p set used this study. The three d functions contribute a substantial 0.338 eV at 
this level, whereas the two f functions contribute 0.055 eV. For the 6s6p3d2f STO 
basis, the CI(SCF,2s2p) EA is 0.993eV (0.991 eV, for the 6s5p3d2f STO basis). This 
is about 0.05 eV less than the Sasaki and Yoshimine [l] CI(SCF,2s2p) EA of 1.041 
eV, which should be near the basis set limit at this level. Feller and Davidson find 
a very similar CI(SCF,2s2p) EA using a 4d2f GTO basis. They also add a single g 
function, which is not optimized; this increases the CI(SCF,2s2p) EA by 0.037 eV, 
but increases the CI(CASSCF(2s2p),2s2p) EA by only 0.019 eV. The optimization 
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[3] of the d and f polarization function shows that the O energy is more sensitive to 
the choice of exponents than is O - . Thus any error in the choice of the polarization 
function exponents tends to lead to too large an EA. Therefore we conclude that 
the 0.037 eV g function contribution is too large, and that 0.02 eV is probably a 
better estimate, with the saturation of the d and f spaces being of about the same 
importance. 

The EA at various levels of correlation treatment using the 6s5pld and 6s5p3d2f 
STO bases are summarized in Table II. Consider first the 6s5pld basis set results 
where we have been able to perform the FCI(2s2p) calculation. The difference of 
0.13 eV between the Cl(SCF,2p) and CI(SCF,2s2p) electron affinities suggests a 
substantial reduction from 2s correlation. However, the reduction from including 
2s correlation is only 0.085 eV if a correction is added for quadruple excitations. At 
the full Cl level this reduction is only 0.027 eV. Clearly the importance of including 
the 2s changes markedly with increasing excitation level. 

Since certain classes of higher excitation can be included efficiently using the 
MCSCF-MRCI approach, we next considered calculations from a CASSCF refer- 
ence. When only 2p correlation is included, the CI(CASSCF(2p),2p) EA is only 
0.011 eV less than the FCI value. If the 2s correlation is included for this choice 
of reference space, CI(CASSCF(2p),2s2p), the EA of 1.025eV is 0.034 eV less than 
the FCI value. If the same orbitals are used, but the Cl reference space is in- 
creased to include all distributions of the 2s and 2p electrons in the 2s, 2p and 
2p' orbitals, the EA is increased by only 0.003 eV. Thus to improve the computed 
EA, more orbitals must be included in the CASSCF orbital space and Cl reference 
space. While the inclusion of the 2s electrons and the 2s' orbital in the CASSCF 
improves the CASSCF(2s2p) EA by 0.157 eV, relative to the CASSCF(2p) EA, the 
CI(CASSCF(2s2p),2s2p) EA is only 0.011 eV larger than the CI(CASSCF(2p),2s2p) 
EA, and is thus still in error by 0.023 eV. 

The importance of higher excitations is considerably enhanced for the larger 
6s5p3d2f basis. Including 2s correlation in the SCF reference SDCI decreases the EA 
by only 0.073 eV, and by only 0.017 eV when the Davidson correction for unlinked 
quadruples is added. The same coupling of basis set effects and the importance of 
higher excitations is illustrated by the difference between the CI(SCF,2p) and the 
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FCI(2p) electron affinities, which is 0.223 eV for the 6s5p3d2f STO basis and 0.183 
eV for the 6s5pld STO basis. The difference between the CI(CASSCF(2p),2p) 
and FCI(2p) EA also increases with the basis set improvement, being 0.024 eV for 
the larger basis compared to 0.011 eV for the Id basis set. The difficulty of fully 
accounting for the effect of higher excitations is much more pronounced when the 2s 
electrons are correlated. For example, Feller and Davidson using a comparable basis 
obtain essentially the same CI(SCF,2s2p) EA, but their selected-reference MRCI 
calculation, based on CASSCF(2s2p) orbitals and correlating 2s and 2p, obtains 
1.229 eV, compared to our CI(CASSCF(2s2p),2s2p) result of 1.290 eV. Thus the 
FD selection of references compared to our use of all CASSCF configurations as Cl 
references has a substantial effect, even for a coefficient selection threshold of 0.01. 
Note that the CI(CASSCF(2s2p),2s2p) calculation for 0“ in the 6s5p3d2f basis 
consists of all single and double excitations from 588 CSF’s yielding a total 545,952 
CSF’s in D 2 h. symmetry. The motivation for reference selection is clearly evident, 
but leads to ambiguities for further improvements in the CASSCF treatment. For 
example, inclusion of the 3d orbital in the CASSCF treatment makes some selection 
of the Cl references mandatory to keep the computations tractable. The relatively 
small affect of this extension is difficult to assess considering the effect of reducing 
the Cl reference space. 

The positive contribution of the 2s correlation at the CASSCF-MRCI level 
using the 6s5p3d2f basis is consistent with the trends observed *t the CI(SCF) 
and CI(SCF)-fQ levels as the basis set is improved. The'COntribution of the 2s 
correlation increases but is more difficult to account for as the basis set size is in- 
creased. At present, we are unable to perform the FCI(2s2p) calculations for the 
6s5p3d2f basis set, since this involves an expansion consisting of 488 million de- 
terminants and about 4 billion intermediate states. However, the FCI(2s2p) EA 
can be estimated assuming that the difference CI(CASSCF(2p),2s2p)-FCI(2s2p) 
or CI(CASSCF(2s2p),2s2p)-FCI(2s2p) increases at the same rate as the dif- 
ference CI(CASSCF(2p),2p)-FCI(2p) with basis set improvements. From the 
CI(CASSCF(2p),2s2p) calculation, we estimate the EA to be 1.35 eV, and from 
the CI(CASSCF(2s2p),2s2p) calculations we estimate 1.34 eV. These are probably 
underestimates, because the importance of higher excitations increases faster for 
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the FCI(2s2p) wave function than for FCI(2p) wave function. Thus, the FCI(2s2p) 
EA in the 6s5p3d2f could easily be as large as 1.36 eV. The remaining error of 0.1 
eV can be rationalized in terms of the synergistic effect of basis set incompleteness 
and the contribution of higher excitations. That is, a basis set incompleteness error 
of 0.05 eV at the SDCI level becomes twice as large at the FCI(2s2p) level. These 
arguments also imply a relatively large positive differential contribution of the 2s 
correlation at the FCI level of 0.07 eV in the 6s5p3d2f basis, and quite possibly 0.1 
eV in a complete one-particle basis. 

Our theoretical results are reasonably consistent with a recent study of the 
EA of oxygen using Moller-Plesset perturbation theory by Raghavachari (5j. In 
particular, the value of 1.36 eV obtained by performing coupled-cluster calculations 
including all double substitutions, augmented by an estimate of the contributions 
of single and triple substitutions from the CCD wave function, CCD+ST(CCD), 
is in good agreement with our FCI(2s2p) result. When the ST contribution is 
estimated from fourth-order Moller Plesset theory, CCD+ST(MP4), the value of 
1.41 eV is probably too large. It would, however, be of considerable interest to, see 
a more exact comparison of CCD calculations and the full Cl results. This would 
resolve whether the excitations neglected in CCD make only a small contribution 
or whether the good agreement results from a cancellation of errors. 

IV. Green’s Function Monte Carlo 

Although our full Cl calculations give insight into what the computational re- 
quirements are for computing an electron affinity of oxygen with chemical accuracy, 
we presently cannot perform these calculations. Since the EA and the differential 
correlation energy (of 2.0 eV), are not expected to converge more rapidly than the 
total valence correlation energy, a calculation accounting for over 95% of the valence 
correlation energy is required to produce an EA to within 0.1 eV. Therefore, meth- 
ods such as released node Green’s function Monte Carlo [13], that can in principle 
account for all of the correlation energy, would seem to be especially appropriate 
for this problem. 

Using a vectorized implementation of Green’s function Monte Carlo on the 
Cyber-205, we describe here our initial attempts to compute the EA of oxygen 
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in collaboration with David Ceperley. The trial wavefunction used to guide the 
random walk consisted of a Slater determinant times a pair correlation function 
(Slater-Jastrow). The trial function only affects the statistical error of the energy, 
and not its limiting value. The atomic orbitals were determined from an SCF 
calculation using a 4s3p STO basis set. The EA computed at the fixed node level 
is 1.137±0.063 eV. The nodal release procedure [17] employed, however, did not 
converge. The calculation was still far from convergence (especially for oxygen) 
after 42 generations [13,17]. It is impractical to continue the calculation further 
as the expectation values would become increasingly noisy and the total number 
of walks grows geometrically. This indicates that either a substantially better trial 
function is required, that is, the nodes need to be more accurately positioned by 
the trial function so that the relaxation to the correct nodes occurs more quickly, 
or that a more efficient nodal release procedure is required. We conclude that it is 
presently not feasible to compute an accurate EA for a system as heavy as oxygen 
using released node Green’s function Monte Carlo. We present this problem as a 
challenge to future developments of the method. 

Recently Barnett, Reynolds and Lester [18] reported a calculation of the EA of 
fluorine using fixed-node Monte Carlo. They obtained over 90% of the correlation for 
both the neutral and the anion, and an electron affinity of 3.45±0.11 eV in excellent 
agreement with experiment. A single determinant, constructed with a double-zeta 
basis set, multiplied by electron-electron and electron-nuclear Jastrow functions 
were used as importance functions. These, results contradict our experience with 
oxygen and the concept that methods which obtain 90% of the total correlation 
energy, such as CASSCF-MRCI, should yield 90% of the differential correlation 
contribution to the EA, which should lead to an error of about 0.2 eV for the EA 
of both O and F. Perhaps their trial function fortuitously places the nodes better 
for F“, or some bias is introduced by the extrapolation to a zero time step. 

V. Conclusions. 

The CASSCF-MRCI and full Cl calculations reported here show that higher 
excitations become more important as the one-particle basis set is improved. At 
high levels of correlation treatment, correlation of the 2s electrons makes a positive 
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contribution to the electron affinity of oxygen. In fact, we estimate that at the 
full Cl basis set limit the 2s contribution could be as large as 0.1 eV. Reduction 
of the CASSCF reference space from which the MRCI is carried out is found to 
significantly affect the electron affinity, even for a selection threshold of 0.01 on the 
coefficients. Our attempts to compute a quantitative electron affinity for oxygen 
using both fixed-node and released-node Monte Carlo was not very successful. It 
is hoped that future developments in the released node procedure will significantly 
improve the applicability of Monte Carlo methods to systems as heavy as oxygen. 
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Table I. The computed EA as a function of the addition of polarization functions, 
for the CI(SCF,2p) level, in eV, taken from Reference 3. 


Basis 

EA 

6s6p 

0.676 

Id 

0.912 

2d 

0.996 

3d 

1.014 

3dlf 

1.047 

3d2f 

1.069 

NHF 

-0.54 

EXP 

1.462 a 


0 Reference 6. 
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Table II. Summary of computed EA’s, in eV. 


Calculation 

valence basis 

EA 

NHF 


-0.54 

SCF 6s5p 

6s5pld 

-0.54 

CI(SCF,2p) 


0.903 

CI(SCF,2p)+Q 


1.042 

CI(CASSCF(2p),2p) 


1.075 

FCI(2p) 


1.086 

CI(SCF,2s2p) 


0.771 

Cl(SCF,2s2p)+Q 


0.956 

CI(CASSCF(2p),2s2p) 


1.025° 

CI(CASSCF(2s2p),2s2p) 


1.036 

FCI(2s2p) 

6s5p3d2f 

1.059 

CI(SCF,2p) 


1.064 

CI(SCF,2p)+Q 


1.217 

CI(CASSCF(2p),2p) 


1.263 fc 

FCI(2p) 


1.287 

CI(SCF,2s2p) 


0.991 

CI(SCF,2s2p)+Q 


1.200 

CI(CASSCF(2p),2s2p) 


1.277 

CI(CASSCF(2s2p),2s2p) 


1.290 

EXP 


1.462 c 

0 If the Cl calculations are modified to include single and double excitations from all 
possible distributions of the 2s and 2p electrons among the 2s, 2p and 2p' orbitals 


the EA is increased to 1.028eV. 

b If the MCSCF and MRCI are expanded to include the 3d orbital as active, the 
EA is increased by 0.017eV. 
c Reference 6. 


14 





Volume 126, number 5 


CHEMICAL PHYSICS LETTERS 


16 May 1986 


A FULL Cl TREATMENT OF Ne ATOM 

- A BENCHMARK CALCULATION PEREORMED ON THE NAS CRAY 2 

Charles W. BAUSCHLICHER Jr., Stephen R. LANGHOFF 

NASA Ames Research Center, Moffett Field, CA 94035, USA 

Peter R. TAYLOR 

FLORET Institute } , Sunnyvale, CA 94087, USA 


and 

Harry PARTRIDGE 

Research Institute for Advanced Computer Science, NASA Ames Research Center ; Moffett Field, CA 94035, USA 


Received 28 January 1986 


Full Cl calculations are performed for Ne atom using Gaussian basis sets of up to triple-zeta plus double polarization 
quality. The total valence correlation energy through double, triple, quadruple and octuple excitations is compared for eight 
different basis sets. These results are expected to be an important benchmark for calibrating methods for estimating the 
importance of, higher excitations. 


1 . Introduction 

In a configuration-interaction (Cl) calculation, the 
electronic correlation energy is obtained through a 
double basis set expansion [1] . The one-particle mo- 
lecular orbitals are first expanded in an atomic basis 
set, and the /7-particle basis set is then expanded in 
determinants (or a spin and space symmetry adapted 
linear combination of determinants, namely configu- 
ration state functions, CSFs). Although one can ob- 
tain the one-particle basis limit for uncorrelated self- 
consistent -field, SCF, wavefunctions,it is impossible 
to reach the basis set limit for full configuration- 
interaction, FCI, wavefunctions. Presently, the most 
common approach for including electron correlation 
is to include all single and double (or perturbation 
theory selected) excitations from a zeroth -order space 
consisting of the most important configurations (see 

1 Mailing address: NASA Ames Research Center, Moffett 
Field, CA 94035, USA. 


refs. [1,2], and references therein). For such wave- 
functions, it is now possible to use very large one-par- 
ticle basis sets. Thus, the calculations are limited in 
accuracy primarily by the truncation of higher exci- 
tations from the /7-particle space. Various methods 
have been proposed to estimate the importance of 
higher excitations both on the energy and on proper- 
ties [3—5]. 

The factorial increase in the number of CSFs with 
excitation level has limited Cl calculations which ac- 
count explicitly for higher than double excitations to 
small one-particle basis sets. Of particular significance 
are the FCI calculations of Handy and co-workers 
[6,7] . These calculations have been useful in calibrat- 
ing the effect of higher excitations, but the small atom- 
ic basis sets employed have resulted in rather small to- 
tal correlation energies. However, with the advent of 
super computers such as the CRAY 2, with is combi- 
nation of large memory and vectorized matrix multi- 
ply capabilities, it is possible by exploiting recent de- 
velopments in the FCI methodology to consider FCI 
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calculations in larger basis sets. The first such develop- 
ment was Siegbahn’s realization [8] that the FCI could 
be vectorized in terms of matrix multiplies if the two- 
electron coupling coefficients are expanded as prod- 
ucts of one-electron matrix elements. Further, Knowles 
and Handy showed [9] that of the FCI wavefunction 
is expanded in determinants instead of CSFs, matrix 
elements can be easily constructed as needed (with in- 
creased memory requirements), thereby avoiding a for- 
mula tape and greatly decreasing the input/output 
(I/O) operations. This formulation of the FCI prob- 
lem is ideal for the CRAY 2 which has extensive mem- 
ory and a matrix multiply performance in excess of 
the CRAY XMP. Thus by using an implementation of 
the Knowles and Handy full Cl procedure on the 
Numerical Aerodynamic Simulation (NAS) Project 
CRAY 2, we have performed benchmark FCI calcula- 
tions on the Ne atom using Gaussian basis sets of up 
to triple-zeta plus double polarization quality corre- 


lating the eight valence electrons. The resulting corre- 
lation energy is more than twice that of double-zeta 
basis sets used in previous full Cl calibration calcula- 
tions. For comparison we have performed convention- 
al Cl calculations incorporating up through quadruple 
excitations to assess the relative importance of differ- 
ent excitation levels. 


2. Methods 

Three valence basis sets are used in this work. The 
first two use the Huzinaga (9s5p) primitive set [10], 
contracted to [4s2p] and [5s3p] following Dunning 
[11]. The third basis is a [6s4p] contraction of van 
Duijneveldt’s (1 ls7p) primitive basis set [12]. Since 
these calculations are intended for calibration, the 
basis sets are given explicitly in table 1 . To these va- 
lence basis sets, one and two sets of d functions are 


Table 1 

The GTO basis sets a ) 


Function 

s 

P 

9s5p/4s2p b) 

1 

12100.0000(0.001200) 

56.4500(0.020875) 

2 

1821.0000(0.009092) 

12.9200(0.130032) 

3 

432.8000(0.041305) 

3.8650(0.395679) 

4 

132.5000(0.137867) 

1.2030(0.621450) 

5 

43.7700(0.362433) 

0.3444(1.000000) 

6 

5.1270(0.130035) 


7 

14.9100(1.000000) 


8 

1.4910(1.000000) 


9 

0.4468(1.000000) 


Ils6p/6s4p 

1 

47479.00000(0.000219) 

155.15100(0.003157) 

2 

7066.93000(0.001708) 

36.45440(0.023920) 

3 

1603.00000(0.008936) 

11.42280(0.098494) 

4 

450.72400(0.036608) 

4.11803(0.251086) 

5 

146.1 3900(0.1 18542) 

1.55464(1.000000) 

6 

52.42280(0.285128) 

0.57919(1.000000) 

7 

20.26510(1.000000) 

0.20612(1.000000) 

8 

8.14482(1.000000) 


9 

2.41510(1.000000) 


10 

0.92900(1.000000) 


11 

0.33687(1.000000) 



a ) The total SCF energies are [4s2p] = -128.522354, [5s3p] = -128.524013, and [6s4p] = -128.543823 hartree. 

b) The 5s3p contraction is obtained by un contracting s primitive number 6 and p primitive number 4. 
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added. The exponents for the d functions are taken 
from Ahlrichs et al. [13] ; for the one d basis we used 
ot = 2.15, and for the 2d basis we used a = 4.5 and 
1 .3. The 3s component of the d functions is deleted 
in all calculations. 

The orbitals are optimized at the SCF level for the 
1 S state of Ne atom in D 2 h symmetry. The Is orbital 
is frozen in all correlated wavefunctions. To estimate 
the importance of different excitation levels, conven- 
tional Cl calculations are performed on the Cyber- 
205 that include all CSFs through doubles (SD), 
through triples (SDT) and through quadruples 
(SDTQ). Recently, Ahlrichs and co-workers [5] have 
proposed the coupled pair functional (CPF) method 
to account for the importance of higher excitations, 
and have reported [14,15] impressive results for se- 
lected molecules containing first-row atoms. Hence, 
we also include for comparison the CPF results as 
well as the results of the frequently used Davidson 
correction [3] for unlinked quadruple excitations. 

We compute the importance of a given excitation 
level from the difference between it and the next 
lower level. For example, the importance of triples 
is computed as the difference between CI(SDT) and 
CI(SD). We do not decompose the difference between 
the CI(SDTQ) and the CI(FCI), thus the quintuple 
through octuple excitations are combined. 

3. Results and discussion 

The breakdown of the correlation contribution by 


excitation level is summarized for the eight Gaussian 
basis sets in table 2. Since the HF reference is a good 
zeroth-order description of the ground state of Ne 
atom, the correlation energy is dominated by the 
double excitations, which account for over 96% of 
the correlation energy . The contribution from the 
triple excitations is significant varying from 45% to 
60% of the quadruples contribution. The fractional 
contribution from five-fold and higher excitations is 
very small and tends to decrease with increasing basis 
set quality. For example, the contribution of higher 
than double excitations (3 .7-3 .8% of the total cor- 
relation energy) is nearly identical for the [4s2p] and 
[5s3p2d] basis set, but the percent contribution from 
five-fold and higher excitations for the larger basis 
(0.08%) is only about half that of the [4s2p] basis. 

The correlation energies obtained at various exci- 
tation levels are summarized for the Gaussian basis 
sets in table 3. This table again illustrates how close 
is the energy through quadruples to the full Cl ener- 
gy . Table 3 also compares the quadruples contribu- 
tion to the correlation energy obtained using the 
coupled pair functional (CPF) approach and the 
Davidson correction for unlinked quadruple excita- 
tions. The CPF approach accounts for only about 
40% of the contribution from higher than double ex- 
citations for the [4s2p] basis set. This underestima- 
tion arises in part from the fact that the CPF approach 
does not account for the sizable contribution from 
triple excitations and also does not account fully for 
the quadruple excitations. In contrast, the Davidson 
correction is larger, accounting for about 90% of the 


Table 2 

Comparison of percent correlation energy contribution by excitation level 


Basis 

[4s2p] 

[5s3p] 

[6s4p] 

[4s2pld] 

singles + doubles 

96.22 

96.25 

96.07 

96.68 

triples 

1.36 

1.31 

1.41 

1.05 

quadruples 

2.25 

2.34 

2.41 

2.15 

quintuples-octuples 

0.17 

0.10 

0.10 

0.11 


[5s3pld] 

[6s4pld] 

[4s2p2d] 

[5s3p2d] 

singles + doubles 

96.60 

96.41 

96.46 

96.27 

triples 

1.03 

1.12 

1.25 

1.33 

quadruples 

2.30 . 

2.38 

2.18 

2.32 

quintuples -octuples 

0.07 

0.08 

0.11 

0.08 
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Table 3 

Comparison of correlation energies and methods of estimating higher excitations 


Basis 

[4s2p] 

[5s3p] 

[6s4p] 

[4s2pld] 

£(SD) 

-129.621884 

-128.658764 

-128.682712 

-128.696487 

E{ SD) -£( SCF) 

0.099530 

0.134751 

0.138889 

0.174133 

£(SDT) - £(SD) 

0.001406 

0.001836 

0.002041 

0.001898 

£(SDTQ) - £(SD) 

0.003735 

0.005112 

0.005537 

0.005779 

£(FCI) - E{ SD) 

0.003907 

0.005247 

0.005687 

0.005975 

£(SDTQ) - £(SDT) 

0.002329 

0.003276 

0.003496 

0.003881 

£(CPF) - £(SD) 

0.001617 

0.002157 

0.002400 

0.003539 

£(DVD) a) - £(SD) 

0.002114 

0.002808 

0.003101 

0.004675 


[5s3pld] 

[6s4pld] 

[4s2p2d] 

[5s3p2d] 

£(SD) 

-128.735002 

-128.759855 

-128.719125 

-128.759746 

£(SD) - £(SCF) 

0.210989 

0.216032 

0.196771 

0.235733 

£(SDT) - £(SD) 

0.002242 

0.002518 

0.002557 

0.003258 

£(SDTQ) - £(SD) 

0.007266 

0.00.7850 

0.006996 

0.008928 

£(FCI) - £(SD) 

0.007429 

0.008034 

0.007219 

0.009131 

£(SDTQ) - £(SDT) 

0.005024 

0.005332 

0.004402 

0.005270 

£(CPF) - £(SD) 

0.004329 

0.004670 

0.004402 

0.005276 

£(DVD) - £(SD) 

0.005606 

0.006004 

0.005728 

0.006823 


a ) Davidson's estimate of higher excitations, ref. [3]. 


quadruples contribution for the [4s2p] basis. For the 
larger basis sets including polarization functions, this 
picture changes somewhat. The CPF now accounts 
for about 60% of the correlation from higher than 
double excitations, since it now accounts for the dom- 
inant portion of the correlation contribution from 
quadruple excitations. Also, for the larger basis sets 
the Davidson correction considerably overestimates 
the contribution from quadruple excitations. 

The largest basis set considered in this study, 
[5s3p2d], accounts for about 75% of the valence- 
shell correlation energy. The difficulty in extending 
the FCI technique to still larger basis sets is illustrated 
by the following numbers. For the [5s3p2d] basis in 
D 2h symmetry, there are 462 CSFs through double 
excitations, 6706 CSFs through triple excitations, 
62234 CSFs through quadruple excitations, and 
2360757 CSFs through octuple excitations. In addi- 
tion, 2360757 CSFs result in 9805897 determinants 
and 78411025 “intermediate states” [7,8], and the 
addition of a single f basis function would increase 
this to about 20 million CSFs. The 2.36 million SCF 
calculation presently takes about 22.8 min per itera- 
tion on the CRAY 2. The present rate limiting step 


is matrix multiply (MXM) which is currently running 
at more than 250 Mflops. Recently Calahan and co- 
workers [16] have implemented an unrolled matrix 
multiply which has achieved 385 Mflops on our sys- 
tem, thus we expect improved performance in the 
future . 


4. Conclusions 

Full Cl calculations have been performed for l S 
Ne atom using Gaussian basis sets of up to triple-zeta 
plus double polarization quality. These calculations 
should supplement existing full Cl benchmark calcu- 
lations in that they account for a significantly larger 
amount of correlation energy. Presently we are per- 
forming additional full Cl calculations on such sys- 
tems as H 2 0, HF, O and O - using the NAS CRAY 2; 
this should supply further insight into the magnitude 
and nature of higher excitations. 
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